DESCRIPTION

Protein C Production in Transgenic Animals

BACKGROUND OF THE INVENTION 

Protein C in its activated form plays an important role in
regulating blood coagulation. Activated protein C, a serine
protease, inactivates coagulation Factors Va and VIIIa by
limited proteolysis. The coagulation cascade initiated by
tissue injury, for example, is prevented from proceeding in
an unimpeded chain reaction beyond the area of injury by
activated protein C.

Protein C is synthesized in the liver as a single-chain
precursor polypeptide which is subsequently processed to a
light chain of about 155 amino acids (Mr = 21,000) and a
heavy chain of 262 amino acids (Mr = 40,000). The heavy and
light chains circulate in the blood as a two-chain inactive
protein, or zymogen, held together by a disulfide bond. When
a 12 amino acid residue peptide is cleaved from the amino
terminus of the heavy chain portion of the zymogen in a
reaction mediated by thrombin, the protein becomes
activated. The N-terminal portion of the light chain
contains nine γ-carboxyglutamic acid (Gla) residues that are
required for the calcium-dependent membrane binding and
activation of the molecule. Another blood protein, referred
to as "protein S", is believed to accelerate the protein
C-catalyzed proteolysis of Factor Va.

Protein C has also been implicated in the action of
tissue-type plasminogen activator (Kisiel et al., Behring
Inst. Mitt. 73:29-42, 1983). Infusion of bovine activated
protein C (APC) into dogs results in increased plasminogen
activator activity (Comp et al., J. Clin. Invest.
68:1221-1228, 1981). Other studies (Sakata et al., Proc.
Natl. Acad. Sci. USA 82:1121-1125, 1985) have shown that
addition of APC to cultured endothelial cells leads to a
rapid, dose-dependent increase in fibrinolytic activity in
the conditioned media, reflecting increases in the activity
of both urokinase-related and tissue-type plasminogen
activators. APC treatment also results in a dose-dependent
decrease in anti-activator activity. In addition, studies
with monoclonal antibodies against endogenous APC (Snow et
al., FASEB Abstracts, 1988) implicate APC in maintaining
patency of arteries during fibrinolysis and limiting the
extent of tissue infarct.

Experimental evidence indicates that protein C may be
clinically useful in the treatment of thrombosis. Several
studies with baboon models of thrombosis have indicated that
activated protein C in low doses will be effective in
prevention of fibrin deposition, platelet deposition and
loss of circulation (Gruber et al., Hemostasis and
Thrombosis XHH: abstract 1512, 1983; Widrow et al.,
Fibrinolysis 2 suppl. 1: abstract 7, 1988; Griffin et al.,
Thromb. Haemostasis 62: abstract 1512, 1989).

In addition, exogenous activated protein C has been shown to
prevent the coagulopathic and lethal effects of
gram-negative septicemia (Taylor et al., J. Clin. Invest.
79:918-925, 1987). Data obtained from studies with baboons
suggest that activated protein C plays a natural role in
protecting against septicemia.

Until recently, protein C was purified from clotting factor
concentrates (Marlar et al., Blood 59:1067-1072, 1982) or
from plasma (Kisiel, J. Clin. Invest. 64:761-769, 1979) and
activated in vitro. However, the possibility that the
resulting product could be contaminated with such infectious
agents as hepatitis virus, cytomegalovirus, or human
immunodeficiency virus (HIV) makes the process unfavorable.

While expression of protein C through recombinant means has
been theoretically possible, as the genes for both human and
bovine protein C are known (Foster et al., Proc. Natl. Acad.
Sci. USA 82:4673-4677, 1985; Foster et al., Proc. Natl.
Acad. Sci. USA 81:4766-4770, 1984 and U.S. Patent
4,775,624), it has been met with limited success. Expression
of some vitamin K-dependent proteins, such as protein C, in
cultured cells has not produced protein C that is both
expressed at commercially valuable levels and biologically
functional when activated, i.e., having anticoagulant
activity (Grinnell et al., in Bruley and Drohan, eds.,
Protein C and Related Anticoagulants: 29-63, Gulf
Publishing, Houston, TX and Grinnell et al., Bio/Technol.
5:1189-1192, 1987). Transgenic expression of protein C has
yielded somewhat higher levels of expression, but the
recombinant protein's anticoagulant activity has still
remained low, with less than 50% of the material having
biological activity (Velander et al., Proc. Natl. Acad. Sci.
USA 89:12003-12007, 1992). Therefore, there remains a need
for producing protein C that is both expressed at high
levels and has therapeutic value.

SUMMARY OF THE INVENTION 

It is an object of the present invention to provide methods
for producing protein C in transgenic animals. It is a
further object to provide transgenic animals that express
human protein C in a mammary gland.

Within one aspect, the present invention provides methods
for producing protein C in a transgenic animal comprising
(a) providing a DNA construct comprising a first DNA segment
encoding a secretion signal and a protein C propeptide
operably linked to a second DNA segment encoding protein C,
wherein the encoded protein C comprises a two-chain cleavage
site modified from Lys-Arg to R1-R2-R3-R4, and wherein each
of R1-R4 is individually Lys or Arg, and wherein said first
and second segments are operably linked to additional DNA
segments required for expression of the protein C DNA in a
lactating mammary gland of a host female animal; (b)
introducing said DNA construct into a fertilized egg of a
non-human mammalian species; (c) inserting said egg into an
oviduct or uterus of a female of said species to obtain
offspring carrying said DNA construct; (d) breeding said
offspring to produce female progeny that express said first
and second DNA segments and produce milk containing protein
C encoded by said second segment, wherein said protein has
anticoagulant activity upon activation; (e) collecting milk
from said female progeny; and (f) recovering the protein C
from the milk. In one embodiment, R1-R2-R3-R4 is
Arg-Arg-Lys-Arg (SEQ ID NO: 20). In another embodiment, the
method further comprises the step of activating the protein
C. In another embodiment, the non-human mammalian species is
selected from sheep, rabbits, cattle and goats. In another
embodiment, each of the first and second DNA segments
comprises an intron. In another embodiment, the second DNA
segment comprises a DNA sequence of nucleotides as shown in
SEQ ID NO: 1 or SEQ ID NO: 3. In another embodiment, the
additional DNA segments comprise a transcriptional promoter
selected from the group consisting of casein,
β-lactoglobulin, α-lactoglobulin, α-lactalbumin and whey
acidic protein gene promoters.

In another aspect, the present invention provides a
transgenic non-human female mammal that produces recoverable
amounts of human protein C in its milk, wherein at least 90%
of the human protein C in the milk is two-chain protein C.

In another aspect, the present invention provides a process
for producing a transgenic offspring of a mammal comprising
the steps of (a) providing a DNA construct comprising a
first DNA segment encoding a secretion signal and a protein
C propeptide operably linked to a second DNA segment
encoding protein C, wherein the encoded protein C comprises
a two-chain cleavage site modified from Lys-Arg to
R1-R2-R3-R4, and wherein each of R1-R4 is individually Lys
or Arg, and wherein said first and second segments are
operably linked to additional DNA segments required for
expression of the protein C DNA in a lactating mammary gland
of a host female animal; (b) introducing said DNA construct
into a fertilized egg of a non-human mammalian species; and
(c) inserting said egg into an oviduct or uterus of a female
of said species to obtain offspring carrying said DNA
construct.

Within another aspect, the present invention provides
non-human mammals produced according to the process for
producing a transgenic offspring of a mammal comprising the
steps of (a) providing a DNA construct comprising a first
DNA segment encoding a secretion signal and a protein C
propeptide operably linked to a second DNA segment encoding
protein C, wherein the encoded protein C comprises a
two-chain cleavage site modified from Lys-Arg to
R1-R2-R3-R4, and wherein each of R1-R4 is individually Lys
or Arg, and wherein said first and second segments are
operably linked to additional DNA segments required for
expression of the protein C DNA in a lactating mammary gland
of a host female animal; (b) introducing said DNA construct
into a fertilized egg of a non-human mammalian species; and
(c) inserting said egg into an oviduct or uterus of a female
of said species to obtain offspring carrying said DNA
construct.

In another aspect, the present invention provides a
non-human mammalian embryo containing in its nucleus a
heterologous DNA segment encoding protein C, wherein the
encoded protein C comprises a two-chain cleavage site
modified from Lys-Arg to R1-R2-R3-R4, and wherein each of
R1-R4 is individually Lys or Arg.

BRIEF DESCRIPTION OF THE DRAWINGS 

Figure 1 illustrates analysis of plasma-derived and
transgenic protein C run under non-reducing and reducing
conditions. Lane 1 is plasma-derived protein C and lane 2 is
transgenic protein C from the milk of sheep 30851.

Figure 2 illustrates sequencing of protein C from sheep line
30851. The initial yields were prosequence = 9 pmol, light
chain = 563 pmol and heavy chain = 565 pmol.

Figure 3 illustrates clotting activity of 
transgenic protein C compared to plasma-derived protein C. 

DETAILED DESCRIPTION OF THE INVENTION

Prior to setting forth the invention in detail, 
it will be helpful to define certain terms used herein: 

As used herein, the term "biologically active" is used to
denote protein C that is characterized by its anticoagulant
and fibrinolytic properties. Protein C, when activated,
inactivates factor Va and factor VIIIa in the presence of
phospholipid and calcium. Activated protein C also enhances
fibrinolysis, an effect believed to be mediated by the
lowering of the levels of plasminogen activator inhibitors.
As stated previously, two-chain protein C is activated upon
cleavage of a 12 amino acid peptide from the amino terminus
of the heavy chain portion of the zymogen.

The term "egg" is used to denote an unfertilized ovum, a
fertilized ovum prior to fusion of the pronuclei or an early
stage embryo (fertilized ovum with fused pronuclei).

A "female mammal that produces milk containing biologically
active protein C" is one that, following pregnancy and
delivery, produces, during the lactation period, milk
containing recoverable amounts of protein C that can be
activated to be biologically active. Those skilled in the
art will recognize that such animals will naturally produce
milk, and therefore the protein C, discontinuously.

The term "progeny" is used in its usual sense to include
offspring and descendants.

The term "heterologous" is used to denote genetic material
originating from a different species than that into which it
has been introduced, or a protein produced from such genetic
material.

Within the present invention, transgenic animal technology
is employed to produce protein C within a mammary gland of a
host female mammal. Expression in the mammary gland and
subsequent secretion of the protein of interest into the
milk overcomes many difficulties encountered in isolating
proteins from other sources. Milk is readily collected,
available in large quantities, and well characterized
biochemically. Furthermore, the major milk proteins are
present in milk at high concentrations (from about 1 to
16 g/l).

From a commercial point of view, it is clearly preferable to
use as the host a species that has a large milk yield. While
smaller animals such as mice and rats can be used (and are
preferred at the proof-of-concept stage), within the present
invention it is preferred to use livestock mammals including
sheep and cattle. Sheep are particularly preferred due to
such factors as the previous history of transgenesis in this
species, milk yield, generation time, cost and the ready
availability of equipment for collecting sheep milk. It is
generally desirable to select a breed of host animal that
has been bred for dairy use, such as East Friesland sheep,
or to introduce dairy stock by breeding of the transgenic
line at a later date. In any event, animals of known, good
health status should be used.

Cloned DNA sequences encoding human protein C have been
described (Foster and Davie, Proc. Natl. Acad. Sci. USA
81:4766-4770, 1984; Foster et al., Proc. Natl. Acad. Sci.
USA 82:4673-4677, 1985; and Bang et al., U.S. Patent
4,775,624, each incorporated herein by reference).
Complementary DNAs encoding protein C can be obtained from
libraries prepared from liver cells of various mammalian
species according to standard laboratory procedures. DNAs
from other species, such as those encoding the protein C of
rats, pigs, sheep, cows and primates, can be used and can be
identified using probes from human cDNA.

In a preferred embodiment, human genomic DNAs encoding
protein C are used. The human protein C gene is composed of
nine exons ranging in size from 25 to 885 nucleotides, and
seven introns ranging in size from 92 to 2668 nucleotides
(U.S. Patent 4,959,318, incorporated herein by reference).
The first exon is non-coding and referred to as exon 0. Exon
I and a portion of exon II code for the 42 amino acid signal
sequence and propeptide (i.e., pre-propeptide). The
remaining portion of exon II, exon III, exon IV, exon V and
a portion of exon VI code for the light chain of protein C.
The remaining portion of exon VI, exon VII and exon VIII
code for the heavy chain of protein C. A representative
human genomic DNA sequence and corresponding amino acid
sequence are shown in SEQ ID NOS: 1 and 2, respectively. A
representative human protein C cDNA sequence and
corresponding amino acid sequence are shown in SEQ ID NOS: 3
and 4, respectively.

Those skilled in the art will recognize that naturally
occurring allelic variants of these sequences will exist;
that additional variants can be generated by amino acid
substitution, deletion, or insertion; and that such variants
are useful within the present invention. In general, it is
preferred that any engineered variants comprise only a
limited number of amino acid substitutions, deletions, or
insertions, and that any substitutions are conservative.
Thus, it is preferred to produce protein C polypeptides that
are at least 90%, and more preferably at least 95% or more,
identical in sequence to the corresponding native protein.
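
The identity thresholds above can be illustrated with a small
calculation. The following is a minimal sketch, assuming the
variant and native sequences have already been aligned to
equal length (gaps written as "-"). The function name
percent_identity and the example sequences are hypothetical
and are not taken from SEQ ID NO: 2 or 4. Identity is counted
here over all aligned columns, which is only one of several
conventions in use.

```python
def percent_identity(native: str, variant: str) -> float:
    """Percent of aligned positions at which the two sequences match.

    Assumes both strings come from a pre-computed alignment, so they
    have equal length and use '-' for gap positions.
    """
    if len(native) != len(variant):
        raise ValueError("aligned sequences must have equal length")
    if not native:
        raise ValueError("sequences must not be empty")
    # A column counts as a match only if both residues agree and
    # the column is not a gap.
    matches = sum(
        1 for a, b in zip(native, variant) if a == b and a != "-"
    )
    return 100.0 * matches / len(native)


# Hypothetical 20-residue example with one conservative substitution
# (Lys -> Arg at position 9); not a real protein C sequence.
native_seq = "ANSFLEELKHSSLERECIEE"
variant_seq = "ANSFLEELRHSSLERECIEE"

identity = percent_identity(native_seq, variant_seq)
meets_90 = identity >= 90.0   # "at least 90%" criterion
meets_95 = identity >= 95.0   # "at least 95%" criterion
```

In practice the alignment itself would come from a dedicated
alignment tool; only the counting step is shown here.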

Within the present invention, the proteolytic processing
involved in the maturation of recombinant protein C from the
single-chain form to the two-chain form (i.e., cleaved
between the light chain and the heavy chain) has been
enhanced by modifying the amino acid sequence around the
two-chain cleavage site. In the normal situation,
endoproteolytic cleavage of the precursor molecule at the
Arg157-Asp158 bond and the removal of the dipeptide
Lys156-Arg157 by a carboxypeptidase activity generate the
light and heavy chains of protein C prior to secretion.
Expression of protein C with the native (Lys-Arg) two-chain
cleavage site produces protein C that may contain up to 40%
or more uncleaved, single-chain protein C (Grinnell et al.,
in Protein C and Related Anticoagulants, eds., Bruley and
Drohan, Gulf, Houston, pp. 29-63, 1990; Suttie, Thromb. Res.
44:129-134, 1986 and Yan et al., Trends Biochem. Sci.
14:264-268, 1989). The single-chain form of protein C may
not be able to be activated. The cleavage site may be in the
form of the amino acid sequence R1-R2-R3-R4, wherein each of
R1 through R4 is individually lysine (Lys) or arginine
(Arg). Particularly preferred sequences include
Arg-Arg-Lys-Arg (SEQ ID NO: 20) and Lys-Arg-Lys-Arg (SEQ ID
NO: 21).
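
Because each of R1 through R4 is independently Lys or Arg,
the definition above covers 2^4 = 16 possible tetrapeptide
cleavage sites. The short sketch below simply enumerates
them; the function name cleavage_site_variants is
hypothetical and serves only to illustrate the combinatorics,
not any procedure described in the specification.

```python
from itertools import product

# The two residues permitted at each of positions R1..R4.
RESIDUES = ("Lys", "Arg")


def cleavage_site_variants() -> list:
    """Return every R1-R2-R3-R4 site with each position Lys or Arg."""
    return ["-".join(combo) for combo in product(RESIDUES, repeat=4)]


variants = cleavage_site_variants()

# The two sequences named as particularly preferred in the text.
preferred = {"Arg-Arg-Lys-Arg", "Lys-Arg-Lys-Arg"}
preferred_present = preferred.issubset(variants)
```

Both preferred sequences end in Lys-Arg, so each retains the
native dibasic pair at positions R3-R4 and differs only in
the two residues added ahead of it.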

In a preferred embodiment, the present invention provides
for recoverable amounts of human protein C in the milk of a
non-human mammal, where at least 90%, preferably at least
95%, of the human protein C is two-chain protein C.

To obtain expression in the mammary gland, a transcription
promoter from a milk protein gene is used. Milk protein
genes include those genes encoding caseins,
beta-lactoglobulin (BLG), α-lactalbumin, and whey acidic
protein. The beta-lactoglobulin promoter is preferred. In
the case of the ovine beta-lactoglobulin gene, a region of
at least the proximal 406 bp of 5' flanking sequence of the
ovine BLG gene (contained within nucleotides 3844 to 4257 of
SEQ ID NO: 5) will generally be used. Larger portions of the
5' flanking sequence, up to about 5 kb, are preferred. A
larger DNA segment encompassing the 5' flanking promoter
region and the region encoding the 5' non-coding portion of
the beta-lactoglobulin gene (contained within nucleotides 1
to 4257 of SEQ ID NO: 5) is particularly preferred. See
Whitelaw et al., Biochem. J. 286: 31-39, 1992. Similar
fragments of promoter DNA from other species are also
suitable.

Other regions of the beta-lactoglobulin gene may also be
incorporated in constructs, as may genomic regions of the
gene to be expressed. It is generally accepted in the art
that constructs lacking introns, for example, express poorly
in the transgenic lactating mammary gland in comparison with
those constructs that contain introns (see Brinster et al.,
Proc. Natl. Acad. Sci. USA 85: 836-840, 1988; Palmiter et
al., Proc. Natl. Acad. Sci. USA 88: 478-482, 1991; Whitelaw
et al., Transgenic Res. 1: 3-13, 1991; WO 89/01343; WO
91/02318). In this regard, it is generally preferred, where
possible, to use genomic sequences containing all or some of
the native introns of a gene encoding protein C. Within
certain embodiments of the invention, the further inclusion
of at least some introns from the beta-lactoglobulin gene is
preferred. One such region is a DNA segment which provides
for intron splicing and RNA polyadenylation from the 3'
non-coding region of the ovine beta-lactoglobulin gene. When
substituted for the natural 3' non-coding sequences of a
gene, this ovine beta-lactoglobulin segment can both enhance
and stabilize expression levels of the protein C.

For expression of protein C, DNA segments encoding protein C
are operably linked to additional DNA segments required for
their expression to produce expression units. One such
additional segment is the above-mentioned milk protein gene
promoter. Sequences allowing for termination of
transcription and polyadenylation of mRNA may also be
incorporated. Such sequences are well known in the art; for
example, one such termination sequence is the "upstream
mouse sequence" (McGeady et al., DNA 5:289-298, 1986). The
expression units will further include a DNA segment encoding
a secretion signal operably linked to the segment encoding
the protein C polypeptide chain. The secretion signal may be
a native protein C secretion signal or may be that of
another protein, such as a milk protein. The term "secretion
signal" is used herein to denote that portion of a protein
that directs it through the secretory pathway of a cell to
the outside. Secretion signals are most commonly found at
the amino termini of proteins. See, for example, von Heijne,
Nuc. Acids Res. 14: 4683-4690, 1986; and Meade et al., U.S.
Patent No. 4,873,316, which are incorporated herein by
reference.

Construction of expression units is conveniently carried out
by inserting a protein C sequence into a plasmid or phage
vector containing the additional DNA segments, although the
expression unit may be constructed by essentially any
sequence of ligations. It is particularly convenient to
provide a vector containing a DNA segment encoding a milk
protein and to replace the coding sequence for the milk
protein with that of a protein C (including a secretion
signal), thereby creating a gene fusion that includes the
expression control sequences of the milk protein gene. In
any event, cloning of the expression units in plasmids or
other vectors facilitates the amplification of the protein C
sequences. Amplification is conveniently carried out in
bacterial (e.g., E. coli) host cells; thus the vectors will
typically include an origin of replication and a selectable
marker functional in bacterial host cells.

The expression unit is then introduced into fertilized eggs
(including early-stage embryos) of the chosen host species.
Introduction of heterologous DNA can be accomplished by one
of several routes, including pronuclear microinjection
(e.g., U.S. Patent No. 4,873,191), retroviral infection
(Jaenisch, Science 240: 1468-1474, 1988) or site-directed
integration using embryonic stem (ES) cells (reviewed by
Bradley et al., Bio/Technology 10: 534-539, 1992). The eggs
are then implanted into the oviducts or uteri of
pseudopregnant females and allowed to develop to term.
Offspring carrying the introduced DNA in their germ line can
pass the DNA on to their progeny in the normal, Mendelian
fashion, allowing the development of transgenic herds.
General procedures for producing transgenic animals are
known in the art. See, for example, Hogan et al.,
Manipulating the Mouse Embryo: A Laboratory Manual, Cold
Spring Harbor Laboratory, 1986; Simons et al., Bio/Technology
6: 179-183, 1988; Wall et al., Biol. Reprod. 32: 645-651,
1985; Buhler et al., Bio/Technology 8: 140-143, 1990; Ebert
et al., Bio/Technology 9: 835-838, 1991; Krimpenfort et al.,
Bio/Technology 9: 844-847, 1991; Wall et al., J. Cell.
Biochem. 49: 113-120, 1992; WIPO publications WO 88/00239, WO
90/05188, and WO 92/11757; and GB 87/00458, which are
incorporated herein by reference. Techniques for introducing
foreign DNA sequences into mammals and their germ cells were
originally developed in the mouse. See, e.g., Gordon et al.,
Proc. Natl. Acad. Sci. USA 77: 7380-7384, 1980; Gordon and
Ruddle, Science 214: 1244-1246, 1981; Palmiter and Brinster,
Cell 41: 343-345, 1985; Brinster et al., Proc. Natl. Acad.
Sci. USA 82: 4438-4442, 1985; and Hogan et al. (ibid.).
These techniques were subsequently adapted for use with
larger animals, including livestock species (see, e.g., WIPO
publications WO 88/00239, WO 90/05188, and WO 92/11757; and
Simons et al., Bio/Technology 6: 179-183, 1988). To
summarize, in the most efficient route used to date in the
generation of transgenic mice or livestock, several hundred
linear molecules of the DNA of interest are injected into
one of the pronuclei of a fertilized egg. Injection of DNA
into the cytoplasm of a zygote can also be employed.

In general, female animals are superovulated by treatment
with follicle stimulating hormone, then mated. Fertilized
eggs are collected, and the heterologous DNA is injected
into the eggs using known methods. See, for example, U.S.
Patent No. 4,873,191; Gordon et al., Proc. Natl. Acad. Sci.
USA 77: 7380-7384, 1980; Gordon and Ruddle, Science 214:
1244-1246, 1981; Palmiter and Brinster, Cell 41: 343-345,
1985; Brinster et al., Proc. Natl. Acad. Sci. USA 82:
4438-4442, 1985; Hogan et al., Manipulating the Mouse Embryo:
A Laboratory Manual, Cold Spring Harbor Laboratory, 1986;
Simons et al., Bio/Technology 6: 179-183, 1988; Wall et al.,
Biol. Reprod. 32: 645-651, 1985; Buhler et al.,
Bio/Technology 8: 140-143, 1990; Ebert et al., Bio/Technology
9: 835-838, 1991; Krimpenfort et al., Bio/Technology 9:
844-847, 1991; Wall et al., J. Cell. Biochem. 49: 113-120,
1992; WIPO publications WO 88/00239, WO 90/05188, and WO
92/11757; and GB 87/00458, which are incorporated herein by
reference.

For injection into fertilized eggs, the expression units are
removed from their respective vectors by digestion with
appropriate restriction enzymes. For convenience, it is
preferred to design the vectors so that the expression units
are removed by cleavage with enzymes that do not cut either
within the expression units or elsewhere in the vectors. The
expression units are recovered by conventional methods, such
as electro-elution followed by phenol extraction and ethanol
precipitation, sucrose density gradient centrifugation, or
combinations of these approaches.

DNA is injected into eggs essentially as described in Hogan
et al., ibid. In a typical injection, eggs in a dish of an
embryo culture medium are located using a stereo zoom
microscope (x50 or x63 magnification preferred). Suitable
media include Hepes
(N-2-hydroxyethylpiperazine-N'-2-ethanesulphonic acid) or
bicarbonate buffered media such as M2 or M16 (available from
Sigma Chemical Co., St. Louis, USA) or synthetic oviduct
medium (disclosed below). The eggs are secured and
transferred to the center of a glass slide on an injection
rig using, for example, a Drummond pipette complete with
capillary tube. Viewing at lower (e.g., x4) magnification is
used at this stage. Using the holding pipette of the
injection rig, the eggs are positioned centrally on the
slide. Individual eggs are sequentially secured to the
holding pipette for injection. For each injection process,
the holding pipette/egg is positioned in the center of the
viewing field. The injection needle is then positioned
directly below the egg. Preferably using x40 Nomarski
objectives, both manipulator heights are adjusted to focus
both the egg and the needle. The pronuclei are located by
rotating the egg and adjusting the holding pipette assembly
as necessary. Once the pronucleus has been located, the
height of the manipulator is altered to focus the pronuclear
membrane. The injection needle is positioned below the egg
such that the needle tip is in a position below the center
of the pronucleus. The position of the needle is then
altered using the injection manipulator assembly to bring
the needle and the pronucleus into the same focal plane. The
needle is moved, via the joy stick on the injection
manipulator assembly, to a position to the right of the egg.
With a short, continuous jabbing movement, the pronuclear
membrane is pierced to leave the needle tip inside the
pronucleus. Pressure is applied to the injection needle via,
for example, a glass syringe until the pronucleus swells to
approximately twice its volume. At this point, the needle is
slowly removed. Reverting to lower (e.g., x4) magnification,
the injected egg is moved to a different area of the slide,
and the process is repeated with another egg.

After the DNA is injected, the eggs may be cultured to allow
the pronuclei to fuse, producing one-cell or later stage
embryos. In general, the eggs are cultured at approximately
the body temperature of the species used in a buffered
medium containing balanced salts and serum. Surviving
embryos are then transferred to pseudopregnant recipient
females, typically by inserting them into the oviduct or
uterus, and allowed to develop to term. During
embryogenesis, some of the injected DNA integrates in a
random fashion in the genomes of a small number of the
developing embryos.

Potential transgenic offspring are screened via blood
samples and/or tissue biopsies. DNA is prepared from these
samples and examined for the presence of the injected
construct by techniques such as polymerase chain reaction
(PCR; see Mullis, U.S. Patent No. 4,683,202) and Southern
blotting (Southern, J. Mol. Biol. 98:503, 1975; Maniatis et
al., Molecular Cloning: A Laboratory Manual, Cold Spring
Harbor Laboratory, 1982). Founder transgenic animals, or
G0s, may be wholly transgenic, having transgenes in all of
their cells, or mosaic, having transgenes in only a subset
of cells (see, for example, Wilkie et al., Develop. Biol.
118: 9-18, 1986). In the latter case, groups of germ cells
may be wholly or partially transgenic. Where the germ cells
are only partially transgenic, the number of transgenic
progeny from a founder animal will be less than the expected
50% predicted from Mendelian principles. Founder G0 animals
are grown to sexual maturity and mated to obtain offspring,
or G1s. The G1s are also examined for the presence of the
transgene to demonstrate transmission from founder G0
animals. In the case of male G0s, these may be mated with
several non-transgenic females to generate many offspring.
This increases the chances of observing transgene
transmission. Female G0 founders may be mated naturally,
artificially inseminated or superovulated to obtain many
eggs which are transferred to surrogate mothers. The latter
course gives the best chance of observing transmission in
animals having a limited number of young. The
above-described breeding procedures are used to obtain
animals that can pass the DNA on to subsequent generations
of offspring in the normal, Mendelian fashion, allowing the
development of, for example, colonies (mice), flocks
(sheep), or herds (pigs, goats and cattle) of transgenic
animals.
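
The effect of germline mosaicism on transmission can be
illustrated numerically. The sketch below assumes a
simplified model that is not stated in the text: the founder
carries the transgene at a single hemizygous integration
site, a given fraction of its germ cells carry that site, and
each transgenic germ cell passes the transgene to half of its
gametes. The function names are hypothetical.

```python
def expected_transmission(germline_fraction: float) -> float:
    """Expected fraction of offspring inheriting the transgene.

    germline_fraction is the assumed share of the founder's germ
    cells that carry the (hemizygous) transgene: 1.0 for a wholly
    transgenic germ line, less than 1.0 for a mosaic founder.
    """
    if not 0.0 <= germline_fraction <= 1.0:
        raise ValueError("germline_fraction must be between 0 and 1")
    # Half of the gametes from a transgenic germ cell carry the
    # integration site, giving the Mendelian 50% ceiling.
    return 0.5 * germline_fraction


def expected_transgenic_offspring(n_offspring: int,
                                  germline_fraction: float) -> float:
    """Expected number of transgenic G1 animals among n_offspring."""
    if n_offspring < 0:
        raise ValueError("n_offspring must be non-negative")
    return n_offspring * expected_transmission(germline_fraction)


# Non-mosaic founder: the Mendelian expectation of 50%.
full = expected_transmission(1.0)

# Hypothetical mosaic founder with 40% transgenic germ cells.
mosaic = expected_transmission(0.4)
```

Under this model a mosaic founder always transmits below 50%,
which is why mating a male founder to several females, as
described above, improves the chance of detecting
transmission.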
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The milk from lactating GO and Gl females is 
examined for the expression of the heterolo<rous protein 
using immunological techniques such as ELISA (see Harlow 
and Lane, AnCib':?;^ ffR , — A r.ahora tory Manual . Cold Spring 
5 Harbor Laboratory, 1988) and Western bloccinc- (Towbin et 
al., Proe. Natl. Ar.ar^ . Sci . tJSA 7fi; 4350-4354, 1979). For 
a variety of reasons known in the art, exprcission levels 
of the heterologous protein will be expected to differ 
between individuals . 
A satisfactory family of animals should satisfy three criteria: they
should be derived from the same founder G0 animal; they should exhibit
stable transmission of the transgene; and they should exhibit
acceptably stable expression levels from generation to generation and
from lactation to lactation of individual animals. These principles
have been demonstrated and discussed (Carver et al., Bio/Technology
11: 1263-1270, 1993). Animals from such a suitable family are referred
to as a "line." Initially, male animals, G0 or G1, are used to derive
a flock or herd of producer animals by natural or artificial
insemination. In this way, many female animals containing the same
transgene integration event can be quickly generated from which a
supply of milk can be obtained.
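
By way of illustration only, the benefit of breeding a male founder to
several females can be sketched numerically. The Python sketch below
assumes a single, non-mosaic, hemizygous integration transmitted in
Mendelian fashion (probability 0.5 per offspring) and an equal sex
ratio. Both values and both function names are illustrative
assumptions, not data from the present disclosure; a mosaic G0 founder
would be expected to transmit at a lower rate.

```python
def p_at_least_one(n_offspring: int, p_transmit: float = 0.5) -> float:
    """Probability that at least one of n offspring carries the transgene.

    p_transmit = 0.5 is an assumption (hemizygous, non-mosaic parent).
    """
    return 1.0 - (1.0 - p_transmit) ** n_offspring


def expected_transgenic_females(n_offspring: int,
                                p_transmit: float = 0.5,
                                p_female: float = 0.5) -> float:
    """Expected number of transgenic female offspring (assumed 1:1 sex ratio)."""
    return n_offspring * p_transmit * p_female


if __name__ == "__main__":
    for n in (1, 5, 10, 20):
        print(n, round(p_at_least_one(n), 4), expected_transgenic_females(n))
```

Under these assumptions the chance of observing transmission rises
quickly with the number of offspring, which is the rationale for
mating one male to several females.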

The protein C is recovered from milk using standard practices such as
skimming, precipitation, filtration and protein chromatography
techniques.

Protein C produced according to the present invention can be activated
by removal of the activation peptide from the amino terminus of the
heavy chain. Activation can be achieved using methods that are well
known in the art, for example, using α-thrombin (Marlar et al., Blood
59:1067-1072, 1982), trypsin (Marlar et al., 1982, ibid.), Russell's
viper venom factor X activator (Kisiel, J. Clin. Invest. 64:761-769,
1979) or commercially available Protac C (American Diagnostica, NY,
NY).

The protein C molecules provided by the present invention and
pharmaceutical compositions thereof are particularly useful for
administration to humans to treat a variety of conditions involving
intravascular coagulation. For instance, although deep vein thrombosis
and pulmonary embolism can be treated with conventional
anticoagulants, the activated protein C described herein may be used
to prevent the occurrence of thromboembolic complications in
identified high risk patients, such as those undergoing surgery or
those with congestive heart failure. Since activated protein C is more
selective than heparin, being active in the body generally when and
where thrombin is generated and fibrin thrombi are formed, activated
protein C will be more effective and less likely to cause bleeding
complications than heparin when used prophylactically for the
prevention of deep vein thrombosis. The dose of activated protein C
for prevention of deep vein thrombosis is in the range of about 100 µg
to 100 mg/day, and administration should begin at least about 6 hours
prior to surgery and continue at least until the patient becomes
ambulatory. In established deep vein thrombosis and/or pulmonary
embolism, the dose of activated protein C ranges from about 100 µg to
100 mg as a loading dose followed by maintenance doses ranging from 3
to 300 mg/day. Because of the lower likelihood of bleeding
complications from activated protein C infusions, activated protein C
can replace or lower the dose of heparin during or after surgery in
conjunction with thrombectomies or embolectomies.

The activated protein C compositions of the present invention will
also have substantial utility in the prevention of cardiogenic emboli
and in the treatment of thrombotic strokes. Because of its low
potential for causing bleeding complications and its selectivity,
activated protein C can be given to stroke victims and may prevent the
extension of the occluding arterial thrombus.



Printed from Mimosa 06/03/1998 13:49:50 page -19- 



wo 97/20043 



18 



PCT/US96/18866 



The amount of activated protein C administered will vary with each
patient depending on the nature and severity of the stroke, but doses
will generally be in the range of those suggested below.
Pharmaceutical compositions of activated protein C provided herein
will be a useful treatment in acute myocardial infarction because of
the ability of activated protein C to enhance in vitro fibrinolysis.
Activated protein C can be given with tissue plasminogen activator or
streptokinase during the acute phases of the myocardial infarction.
After the occluding coronary thrombus is dissolved, activated protein
C can be given for subsequent days or weeks to prevent coronary
reocclusion. In acute myocardial infarction, the patient is given a
loading dose of at least about 1-500 mg of activated protein C,
followed by maintenance doses of 1-100 mg/day.

Activated protein C is useful in the treatment of disseminated
intravascular coagulation (DIC). Patients with DIC characteristically
have widespread microcirculatory thrombi and often severe bleeding
problems which result from consumption of essential clotting factors.
Because of its selectivity, activated protein C will not aggravate the
bleeding problems associated with DIC, as do conventional
anticoagulants, but will retard or inhibit the formation of additional
microvascular fibrin deposits.

The invention is further illustrated by the following non-limiting
examples.

EXAMPLES
Example 1

A. Vector pMAD6 Construction

The multiple cloning site of the vector pUC18 (Yanisch-Perron et al.,
Gene 33:103-119, 1985) was removed and replaced with a synthetic
double stranded oligonucleotide (the strands of which are shown in SEQ
ID NO: 6 and SEQ ID NO: 7) containing the restriction sites Pvu I/Mlu
I/Eco RV/Xba I/Pvu I/Mlu I, and flanked by 5' overhangs compatible
with the restriction sites Eco RI and Hind III. pUC18 was cleaved with
both Eco RI and Hind III, the 5' terminal phosphate groups were
removed with calf intestinal phosphatase, and the oligonucleotide was
ligated into the vector backbone. The DNA sequence across the junction
was confirmed by sequencing, and the new plasmid was called pUCPM.

The β-lactoglobulin (BLG) gene sequences from pSS1tgXS (disclosed in
WIPO publication WO 88/00239) were excised as a Sal I-Xba I fragment
and recloned into the vector pUCPM that had been cut with Sal I and
Xba I to construct vector pUCXS. pUCXS is thus a pUC18 derivative
containing the entire BLG gene from the Sal I site to the Xba I site
of phage SS1 (Ali and Clark, J. Mol. Biol. 199: 415-426, 1988).

The plasmid pSS1tgSE (disclosed in WIPO publication WO 88/00239)
contains a 1290 bp BLG fragment flanked by Sph I and Eco RI
restriction sites, a region spanning a unique Not I site and a single
Pvu II site which lies in the 5' untranslated leader of the BLG mRNA.
Into this Pvu II site was ligated a double stranded, 8 bp DNA linker
(5'-GGATATCC-3') encoding the recognition site for the enzyme Eco RV.
This plasmid was called pSS1tgSE/RV. DNA sequences bounded by Sph I
and Not I restriction sites in pSS1tgSE/RV were excised by enzymatic
digestion and used to replace the equivalent fragment in pUCXS. The
resulting plasmid was called pUCXSRV. The sequence of the BLG insert
in pUCXSRV is shown in SEQ ID NO: 5, with the unique Eco RV site at
nucleotide 4245 in the 5' untranslated leader region of the BLG gene.
This site allows insertion of any additional DNA sequences under the
control of the BLG promoter 3' to the transcription initiation site.
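
The effect of a blunt-end linker on the recognition sites can be
checked with a short string manipulation. The Python sketch below is
illustrative only: it models just the recognition sequence itself
(Pvu II, CAG^CTG; Eco RV, GAT^ATC; Sna BI, TACGTA) with no flanking
DNA, and the function and variable names are assumptions introduced
here. It applies the 8 bp linker described above to a Pvu II site,
and the ACTACGTAGT linker used later in this Example to an Eco RV
site.

```python
def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence (A/C/G/T only)."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def insert_linker(site: str, cut_offset: int, linker: str) -> str:
    """Insert a blunt linker at the cut position of a blunt-cutting site."""
    return site[:cut_offset] + linker + site[cut_offset:]


PVU_II = "CAGCTG"  # blunt cut after CAG
ECO_RV = "GATATC"  # blunt cut after GAT
SNA_BI = "TACGTA"

# 8 bp Eco RV linker ligated into the Pvu II site
pvu_product = insert_linker(PVU_II, 3, "GGATATCC")
# 10 bp linker ligated into the Eco RV site (pMAD6-Sna step)
ecorv_product = insert_linker(ECO_RV, 3, "ACTACGTAGT")

if __name__ == "__main__":
    print(pvu_product, ECO_RV in pvu_product, PVU_II in pvu_product)
    print(ecorv_product, SNA_BI in ecorv_product, ECO_RV in ecorv_product)
```

Both linkers are palindromic, so the result does not depend on the
orientation in which the linker is ligated.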

Using the primers BLGAMP3 (5'-TGG ATC CCC TGC CGG TGC CTC TGG-3'; SEQ
ID NO: 8) and BLGAMP4 (5'-AAC GCG TCA TCC TCT GTG AGC CAG-3'; SEQ ID
NO: 9) a PCR fragment of approximately 650 bp was produced from
sequences immediately 3' to the stop codon of the BLG gene in pUCXSRV.
The PCR fragment was engineered to have a BamH I site at its 5' end
and an Mlu I site at its 3' end and was cloned as such into BamH I and
Mlu I cut pGEM7Zf(+) (Promega) to give pDAM200(+).
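
As a simple check, the engineered sites can be located in the two
primer sequences given above. The following Python sketch is
illustrative; the dictionary and function names are assumptions, and
the recognition sequences used are the standard ones for Bam HI
(GGATCC) and Mlu I (ACGCGT).

```python
SITES = {"Bam HI": "GGATCC", "Mlu I": "ACGCGT"}

PRIMERS = {
    "BLGAMP3": "TGG ATC CCC TGC CGG TGC CTC TGG",  # SEQ ID NO: 8
    "BLGAMP4": "AAC GCG TCA TCC TCT GTG AGC CAG",  # SEQ ID NO: 9
}


def find_sites(primer: str) -> dict:
    """Map each site found in the primer to its 0-based start position."""
    seq = primer.replace(" ", "").upper()
    return {name: seq.find(site) for name, site in SITES.items() if site in seq}


if __name__ == "__main__":
    for name, primer in PRIMERS.items():
        print(name, find_sites(primer))
```

Each primer carries its site one nucleotide in from the 5' end, so
the site is incorporated at the corresponding end of the PCR product.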

pUCXSRV was digested with Kpn I, and the largest, vector-containing
band was gel purified. This band contained the entire pUC plasmid
sequences and some 3' non-coding sequences from the BLG gene. Into
this backbone was ligated the small Kpn I fragment from pDAM200(+)
which, in the correct orientation, effectively engineered a Bam HI
site at the extreme 5' end of the 2.6 Kbp of the BLG 3' flanking
region. This plasmid was called pBLAC200. A 2.6 Kbp Cla I-Xba I
fragment from pBLAC200 was ligated into Cla I-Xba I cut pSP72 vector
(Promega), thus placing an Eco RV site immediately upstream of the BLG
sequences. This plasmid was called pBLAC210.

The 2.6 Kbp Eco RV-Xba I fragment from pBLAC210 was ligated into Eco
RV-Xba I cut pUCXSRV to form pMAD6 (SEQ ID NO: 23). This, in effect,
excised all coding and intron sequences from pUCXSRV, forming a BLG
minigene consisting of 4.2 Kbp of 5' promoter and 2.6 Kbp of 3'
downstream sequences flanking a unique Eco RV site. An oligonucleotide
linker (ZC6a39: ACTACGTAGT; SEQ ID NO: 10) was inserted into the Eco
RV site of pMAD6 (SEQ ID NO: 23). This modification destroyed the Eco
RV site and created a Sna BI site to be used for cloning purposes. The
vector was designated pMAD6-Sna. Messenger RNA initiates upstream of
the Sna BI site and terminates downstream of the Sna BI site. The
precursor transcript will encode a single BLG-derived intron, intron
6, which is entirely within the 3' untranslated region of the gene.

B. Intronless Vector pMAD

The beta-lactoglobulin cloning vector pMAD was also constructed to
allow the insertion of cDNAs under the control of the
beta-lactoglobulin gene promoter in constructs containing no introns.
To generate pMAD, the plasmid pBLAC100 was opened by digestion with
both Eco RV and Sal I. The vector fragment was gel purified and the
linearized vector was ligated with the 4.2 kb promoter fragment from
the plasmid pUCXSRV as a Sal I-Eco RV fragment. The resulting
construct was designated pST1 and constitutes a beta-lactoglobulin
mini-gene encompassing 4.2 kb of promoter region and 2.1 kb of 3'
non-coding region beginning immediately downstream of the
beta-lactoglobulin translational termination codon. A unique Eco RV
site allows blunt-end cloning of any additional DNA sequences. To
generate transgenic animals it is generally accepted in the art and
preferred to separate bacterial plasmid vector sequences from those
intended to be used in the generation of transgenic animals. In order
to allow the practical excision of novel cDNA based constructs using
this beta-lactoglobulin mini-gene, the minigene was excised from pST1
on a Xho I-Not I fragment, the DNA termini were made flush with Klenow
polymerase and the product was ligated into the Eco RV site of pUCPM
to yield pMAD. Digestion with Mlu I liberates beta-lactoglobulin-cDNA
constructs from the bacterial vector backbone.

Intronless constructs based on cDNAs and vectors such as pMAD benefit
from the use of "rescue technology" for efficient expression. Rescue
technology takes advantage of the ability of a co-injected and
co-integrated BLG gene to improve the expression levels obtained from
intronless, cDNA-based constructs in the transgenic system. Rescue
technology is disclosed in WIPO publication WO 92/11358, and is
incorporated herein by reference.

Example 2

A. Isolation of cDNA

A cDNA sequence coding for human protein C was prepared as described
in U.S. Patent 4,959,318, which is incorporated herein by reference.
Briefly, a genomic fragment containing an exon corresponding to amino
acids -42 to -19 (SEQ ID NO: 1) of the pre-pro peptide of protein C
was isolated, nick translated and used as a probe to screen a cDNA
library constructed by the technique of Gubler and Hoffman, Gene
25:263-269, 1983, using mRNA from HepG2 cells. This cell line was
derived from human hepatocytes and was previously shown to synthesize
protein C (Fair and Bahnak, Blood 64:194-204, 1984). Positive clones
comprising cDNA inserted into the Eco RI site of phage λgt11 were
isolated and screened with an oligonucleotide probe corresponding to
the 5' non-coding region of the protein C gene. One clone was also
positive with this probe and its entire nucleotide sequence was
determined. The cDNA contained 70 bp of 5' untranslated sequence, the
entire coding sequence for human prepro-protein C, and the entire 3'
non-coding region corresponding to the second polyadenylation site.

B. Subcloning of Protein C cDNA

The vector pDX was derived from pD3, which was generated from plasmid
pDHFRIII (Berkner et al., Nuc. Acids Res. 13:841-857, 1985). The Pst I
site immediately upstream from the DHFR sequence in pDHFRIII was
converted to a Bcl I site by digestion with Pst I. The DNA was phenol
extracted, ethanol precipitated and resuspended in buffer B (50 mM
Tris pH 8, 7 mM MgCl2, 7 mM β-MSH). A ligation reaction containing the
linearized plasmid DNA and Bcl I linkers was done. The resulting
plasmid was phenol extracted, ethanol precipitated, digested with Bcl
I and gel purified. The gel purified plasmid DNA was circularized by
ligation and used to transform E. coli HB101. Positive colonies were
identified by restriction analysis and designated pDHFR'. DNA from
positive colonies was isolated and used to transform dam- E. coli.

Plasmid pD2' was generated by cleaving pDHFR' and pSV40 (comprising
Bam HI digested SV40 DNA cloned into the Bam HI site of pML-1 (Lusky
et al., Nature 293:79-81, 1981)) with Bcl I and Bam HI. The DNA
fragments were resolved by gel electrophoresis, and the 4.9 kb pDHFR'
fragment and 0.2 kb SV40 fragment were isolated. These fragments were
used in a ligation reaction, and the resulting plasmid, designated
pD2', was used to transform E. coli.

Plasmid pD2' was modified by deleting the "poison" sequences in the
pBR322 region (Lusky et al., 1981, ibid.). Plasmids pD2' and pML-1
were digested with Eco RI and Nru I. The 1.7 kb pD2' fragment and 1.8
kb pML-1 fragment were isolated by gel purification, circularized in a
ligation reaction and used to transform E. coli HB101. Positive
colonies were identified using restriction analysis (designated pD2)
and digested with Eco RI and Bcl I. A 2.8 kb fragment (fragment C) was
isolated and gel purified.

To generate the remaining fragments used in constructing pD3, pDHFRIII
was modified to convert the Sac II (Sst II) site into either a Hind
III or Kpn I site. pDHFRIII was digested with Sst II and ligation
reactions with either Hind III or Kpn I linkers were done. The
resultant plasmids were digested with either Hind III or Kpn I and gel
purified. The resultant plasmids were designated either pDHFRIII (Hind
III) or pDHFRIII (Kpn I). A 700 bp Kpn I-Bgl II fragment (fragment A)
was purified from pDHFRIII (Hind III).

The SV40 enhancer sequence was inserted into pDHFRIII (Hind III) by
first digesting SV40 DNA with Hind III, and DNA from 5069 to 968 bp
was isolated and purified. Plasmid pDHFRIII (Hind III) was
phosphatased, and the SV40 DNA and linearized plasmid pDHFRIII (Hind
III) were used in a ligation reaction. A 700 bp Eco RI-Kpn I fragment
(fragment B) was isolated from the resulting plasmid.

For the final construction of pD3, fragments A (50 ng), B (50 ng) and
C (10 ng) were combined in a ligation reaction and used to transform
E. coli RR1. Positive colonies were isolated and plasmid DNA was
prepared.

Plasmid pD3 was modified to accept the insertion of the protein C
sequence by converting the Bcl I insertion site to an Eco RI site.
First, the Eco RI site present in pD3 (the leftmost terminus in
adenovirus 5 0-1) was converted to a Bam HI site via conventional
linkering procedures. The resultant plasmid was transformed into E.
coli HB101. Plasmid DNA was prepared, and positive clones were
identified by restriction analysis.

pD3' is a vector identical to pD3 except that the SV40 polyadenylation
signal (i.e., the SV40 Bam HI (2533 bp) to Bcl I (2770 bp) fragment)
is in the late orientation. Thus, pD3' contains a Bam HI site as the
site of gene insertion.

To generate pDX, the Eco RI site in pD3' was converted to a Bcl I site
by Eco RI cleavage, incubation with S1 nuclease and subsequent
ligation with Bcl I linkers. DNA was prepared from a positively
identified colony, and a 1.9 kb Xho I-Pst I fragment containing the
altered restriction site was prepared via gel purification. In a
second modification, Bcl I-cleaved pD3 was ligated with Eco RI-Bcl I
adapters in order to generate an Eco RI site as the position for
inserting a gene into the expression vector. Positive colonies were
identified by restriction analysis. The resulting plasmid, designated
pDX, has a unique Eco RI site for insertion of foreign genes.

The protein C cDNA was inserted into pDX as an Eco RI fragment.
Plasmids were screened by restriction analysis. A plasmid having the
protein C insert in the correct orientation with respect to the
promoter elements was selected, and the plasmid DNA was designated
pDX/PC. Because the cDNA insert in pDX/PC contains an ATG codon in the
5' non-coding region, deletion mutagenesis was performed on the cDNA.
Deletion of the three base pairs was performed according to standard
procedures of oligonucleotide-directed mutagenesis. The pDX-based
vector containing the modified cDNA was designated p594.

C. Modification of the Protein C Processing Site

To enhance the processing of single-chain protein C to the two-chain
form, two additional arginine residues were introduced immediately
upstream of the Lys156-Arg157 cleavage site of the precursor protein,
resulting in a cleavage site consisting of four basic amino acids,
Arg-Arg-Lys-Arg (SEQ ID NO: 20). The resultant mutant precursor of
protein C was designated PC962. It contains the sequence
Ser-His-Leu-Arg-Arg-Lys-Arg-Asp (SEQ ID NO: 22) at the cleavage site.
Processing at the Arg-Asp bond results in a two-chain protein C
molecule.
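
As a consistency check, the mutagenic oligonucleotide ZC962 (SEQ ID
NO: 11) used to introduce this change can be translated and compared
with the cleavage-site sequence stated above. The Python sketch below
is illustrative only. It assumes the oligonucleotide reads
5'-AGTCACCTGAGAAGAAAACGAGACA-3' (the first and last characters are
uncertain in the available text) and that it is in frame from its
first nucleotide; the function name is hypothetical. The standard
genetic code is used.

```python
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Standard genetic code, built from the conventional TCAG ordering.
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
    "*": "Stop",
}

# Assumed reading of ZC962; see the note above.
ZC962 = "AGTCACCTGAGAAGAAAACGAGACA"


def translate(seq: str) -> str:
    """Translate complete codons from position 0; trailing bases are ignored."""
    usable = len(seq) - len(seq) % 3
    residues = [CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3)]
    return "-".join(THREE_LETTER[r] for r in residues)


if __name__ == "__main__":
    print(translate(ZC962))
```

Under the stated assumptions the eight complete codons of the
oligonucleotide encode the modified cleavage-site sequence.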

The mutant molecule was generated by altering the cloned cDNA by
site-specific mutagenesis (essentially as described by Zoller and
Smith, DNA 3:479-488, 1984, for the two-primer method) using the
mutagenic oligonucleotide ZC962 (5'-AGTCACCTGAGAAGAAAACGAGACA-3'; SEQ
ID NO: 11). Plasmid p594 was digested with Sst I, and the
approximately 87 bp fragment was cloned into M13mp11 and
single-stranded template DNA was isolated. Following mutagenesis, a
correct clone was identified by sequencing. Replicative form DNA was
isolated, digested with Sst I, and the protein C fragment was inserted
into Sst I-cut p594. Clones having the Sst I fragment inserted in the
desired orientation were identified by restriction enzyme mapping. The
resulting expression vector was designated pDX/PC962.

D. Intronless Protein C Construct

To facilitate the cloning of the protein C cDNA, PC962, into pMAD, the
cDNA contained in pDX/PC962 was modified to incorporate Eco RV sites
at the extremities of the protein C cDNA insert. A 769 bp Sst II-Pst I
fragment encompassing the 3' end of PC962 was cloned between the Sst
II and Pst I sites of pBluescript II SK® (Stratagene, La Jolla, CA).
The fragment was excised with Sst II and Eco RV and purified. The 5'
portion of PC962 was modified by PCR. The sense oligonucleotide primer
for this reaction covered the 5' ATG region of the cDNA and provided
an Eco RV site upstream of this in the product. The antisense
oligonucleotide primer covered the Sst II site used to generate the
Sst II-Eco RV fragment. The resulting PCR product was digested with
Eco RV and Sst II and ligated with the Sst II-Eco RV 3' fragment and
Eco RV digested pMAD. The resulting plasmid, designated pCORP9,
effectively contained the PC962 cDNA flanked by Eco RV sites in an
intronless fusion driven by the beta-lactoglobulin promoter.

E. Genomic Protein C DNA Construction

A genomic DNA construct containing exons I through VIII was made. See
U.S. Patent 4,959,318, which is incorporated herein by reference, for
disclosure of the exon structure of the protein C gene. This genomic
construct, designated GPC10-1, changed the sequence 15 base pairs
upstream of the ATG from the native protein C sequence to the
beta-lactoglobulin sequence and introduced mutations in the propeptide
cleavage site located in exon 2, and the two-chain cleavage site
located in exon 6, as described below.

The construct was assembled using four fragments designated A, B, C
and D and encompassed the protein C gene sequence from the ATG to a
Bam HI site in exon VIII, immediately upstream of the stop codon. The
fragments were generated from a human genomic library in λ Charon 4A
phage that was screened with a radiolabeled cDNA probe for






NO: 17), which introduced an Afl II site and the RRKR mutation of the
native (KR) two-chain cleavage site, in a polymerase chain reaction.
The resulting PCR-generated fragment was digested with Bgl II and Afl
II and gel purified, resulting in a 1443 base pair fragment,
designated E1. Fragment E1 was used in a ligation reaction with
oligonucleotides ZC6302 (SEQ ID NO: 18) and ZC6304 (SEQ ID NO: 19).
These oligonucleotides form Afl II and Sst II restriction sites when
annealed and were ligated to the 3' end of fragment E1, resulting in a
fragment with a 5' Bgl II site and a 3' Sst II site. This fragment was
used in a ligation reaction with a Bam HI-Sst II digested Bluescript
II KS® phagemid vector (Stratagene). The resulting plasmid was
designated GPC 8-5 and digested with Sst I and Sst II, generating a
626 base pair fragment, designated Fragment C.

A fourth fragment was generated by digestion of a genomic subclone
(pHCB7-1) of PCλ8. pHCB7-1 contained a Bgl II to Bgl II fragment that
encompassed exons VI through VIII. pHCB7-1 was digested with Sst II
and Bam HI and a 2702 base pair fragment was gel purified. The
fragment was designated Fragment D.

A five-part ligation reaction was prepared using Not I and Bam HI
digested and linearized Bluescript II KS® phagemid vector (Stratagene)
with Fragment A (5' Not I to 3' Eco RI) that contained exons I and II,
Fragment B (5' Eco RI to 3' Sst I) that contained exons III, IV and V,
Fragment C (5' Sst I to 3' Sst II) that contained the 5' portion of
exon VI and Fragment D (5' Sst II to 3' Bam HI) that contained the
remaining 3' portion of exon VI and exons VII and VIII. The resulting
DNA was 8950 base pairs and designated GPC 10-1.
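
The directional logic of the five-part ligation can be summarized by
checking that each fragment's 3' site matches the 5' site of the next
and that the outer ends match the cut vector. The Python sketch below
is illustrative; it treats identical site names as compatible ends,
which is a simplifying assumption, and the data structure and
function name are introduced here only for the example.

```python
# (name, 5' site, 3' site) for each insert, as listed in the text.
INSERTS = [
    ("Fragment A", "Not I", "Eco RI"),
    ("Fragment B", "Eco RI", "Sst I"),
    ("Fragment C", "Sst I", "Sst II"),
    ("Fragment D", "Sst II", "Bam HI"),
]

# The vector is opened with Not I and Bam HI.
VECTOR_SITES = ("Not I", "Bam HI")


def ligation_is_consistent(inserts, vector_sites) -> bool:
    """True if the inserts chain end to end and close on the cut vector."""
    first_site, last_site = vector_sites
    if inserts[0][1] != first_site or inserts[-1][2] != last_site:
        return False
    return all(left[2] == right[1] for left, right in zip(inserts, inserts[1:]))


if __name__ == "__main__":
    print(ligation_is_consistent(INSERTS, VECTOR_SITES))
```

Because every junction uses a different pair of sites, only one order
and orientation of the four fragments closes the circle.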

GPC10-1 was originally generated with BLG sequences and a Not I site
upstream of the ATG initiator codon and modifications to both cleavage
sites. A clone, designated pPC12/BS, was generated to ensure that the
5' Not I site of GPC10-1 would not introduce secondary structure into
mRNA molecules that could hinder translation. pPC12/BS was generated
using PCR amplification of a J. kb Not I-Sca I fragment that covered
the 5' region of the protein C gene and contained the wild-type ATG
codon environment. This introduced an Eco RV site immediately
downstream of the Not I site, adjacent to the ATG codon, and a Bam HI
site was incorporated 3' of the Sca I site to facilitate cloning.
Following a Not I/Bam HI digestion, the PCR product was cloned into
Not I-Bam HI digested Bluescript II KS® phagemid vector (Stratagene).
The Not I-Eco RV-Sca I fragment present in pPC12/BS was excised,
purified and ligated to GPC10-1, which had been linearized with Not I
and partially digested with Sca I (the pUC ampicillin gene has an
internal Sca I site). The resulting clone was designated GPC10-2 and
possesses an Eco RV site immediately upstream of the ATG initiator
codon.

GPC10-1 and GPC10-2 both terminated at the final Bam HI site in exon
VIII of the protein C gene. To reconstitute the 56 bp of sequence,
ending at the termination codon, two oligonucleotides were synthesized
with flanking Bam HI (5') and Bgl II (3') restriction sites. Following
annealing of the oligonucleotides, the product was cloned into Bam HI
digested pBST+ to generate plasmid pPC3'. pBST+ is a derivative of pBS
(Stratagene) with a new polylinker. The addition of the polylinker
added Bgl II, Xho I, Nar I and Cla I restriction sites from the vector
polylinker downstream of the destroyed Bgl II site of the
oligonucleotide construct.

The Not I-Bam HI fragment of GPC10-1 was subcloned into Not I/Bam HI
digested pPC3' to add 3' coding sequences of protein C, the TAG
termination codon followed by Bgl II-Xho I-Nar I-Cla I. The 3' region
of the protein C gene beginning with the Eco RV site in intron V was
excised from this plasmid on an Eco RV-Cla I fragment.

The Eco RV-Eco RV fragment from GPC10-2, covering the 5' portion of
the protein C gene, and the above Eco RV-Cla I fragment covering the
3' portion of the protein C gene were combined between the Eco RV and
Cla I sites of pMAD6 (SEQ ID NO: 23) to generate pCORP13. This
effectively placed a genomic portion of the protein C gene with
modified propeptide and two-chain cleavage sites under the control of
the beta-lactoglobulin promoter.

A further genomic construct was generated from pCORP13 that contained
only the modified two-chain cleavage site. This was achieved using PCR
amplification to modify two fragments, which resulted in restoration
of the coding capability of exon 2 from the mutant Gln-Arg-Arg-Lys-Arg
(SEQ ID NO: 25) to the wild-type Arg-Ile-Arg-Lys-Arg (SEQ ID NO: 24).
pCORP13 was used as template for these reactions. The first fragment
was 1.3 kb, which encompassed the 5' end of the protein C gene up to
the Bam HI site in exon 2. For this reason, the sense primer was
designed to add a Hind III site 5' to the Eco RV site proximal to the
ATG initiation codon. The antisense primer was designed to restore the
wild-type sequences in exon 2, which included a restored Bam HI site.
A second fragment of 0.2 kb, from the Bam HI site in exon 2 to the Xho
I site in intron 2, was also amplified. The two fragments were
combined in pGEM11 (Promega, Madison, WI) to generate pGEMPC1.5. A 7.5
kb Xho I fragment from pCORP13 was ligated to Xho I digested pGEMPC1.5
to generate a complete protein C genomic sequence covering exons 1-8
with a wild-type propeptide cleavage site and a modified two-chain
cleavage site. The plasmid was designated pGEMPC14. The sequence was
excised from pGEMPC14 as a Hind III/Sal I fragment. The DNA termini
were repaired using a Klenow reaction and the fragment was blunt-end
ligated into Eco RV digested pMAD6 (SEQ ID NO: 23) to generate
pCORP14.

Example 3

Mice for initial breeding stocks (C57BL6J, CBACA) were obtained from
Harlan Olac Ltd. (Bicester, UK). These were mated in pairs to produce
F1 hybrid cross (B6CBAF1) for recipient females, superovulated
females, stud males and vasectomized males. All animals were kept on a
14 hour light/10 hour dark cycle and fed water and food (Special Diet
Services RM3, Edinburgh, Scotland) ad libitum.

Transgenic mice were generated essentially as described in Hogan et
al., Manipulating the Mouse Embryo: A Laboratory Manual, Cold Spring
Harbor Laboratory, 1986, which is incorporated herein by reference in
its entirety. Female B6CBAF1 animals were superovulated at 4-5 weeks
of age by an i.p. injection of pregnant mares' serum gonadotrophin
(FOLLIGON, Vet-Drug, Falkirk, Scotland) (5 iu) followed by an i.p.
injection of human chorionic gonadotrophin (CHORULON, Vet-Drug,
Falkirk, Scotland) (5 iu) 45 hours later. They were then mated with a
stud male overnight. Such females were next examined for copulation
plugs. Those that had mated were sacrificed, and their eggs were
collected for microinjection.

DNA was injected into the fertilized eggs as described in Hogan et al.
(ibid.). Briefly, the vector containing the protein C expression unit
was digested with Mlu I, and the expression unit was isolated by
sucrose gradient centrifugation. All chemicals used were reagent grade
(Sigma Chemical Co., St. Louis, MO, U.S.A.), and all solutions were
sterile and nuclease-free. Solutions of 20% and 40% sucrose in 1 M
NaCl, 20 mM Tris pH 8.0, 5 mM EDTA were prepared using UHP water and
filter sterilized. A 30% sucrose solution was prepared by mixing equal
volumes of the 20% and 40% solutions. A gradient was prepared by
layering 0.5 ml steps of the 40%, 30% and 20% sucrose solutions into a
2 ml polyallomer tube and allowed to stand for one hour. 100 µl of DNA
solution (max. 8 µg DNA) was loaded onto the top of the gradient, and
the gradient was centrifuged for 17-20 hours at 26,000 rpm, 15°C in a
Beckman TL100 ultracentrifuge using a TLS-55 rotor (Beckman
Instruments, Fullerton, CA, USA). Gradients were fractionated by
puncturing the tube bottom with a 20 ga. needle and collecting drops
in a 96 well microtiter plate. 3 µl aliquots were analyzed on a 1%
agarose mini-gel. Fractions containing the protein C DNA fragment were
pooled and ethanol precipitated overnight at -20°C in 0.3 M sodium
acetate. DNA pellets were resuspended in 50-100 µl UHP water and
quantitated by fluorimetry. The protein C expression unit was diluted
in Dulbecco's phosphate buffered saline without calcium and magnesium
(containing, per liter, 0.2 g KCl, 0.2 g KH2PO4, 8.0 g NaCl, 1.15 g
Na2HPO4) or in TE (10 mM Tris-HCl, 1 mM EDTA pH 7.5). DNA
concentration is adjusted to about 6 µg/ml prior to injection into the
eggs (~2 pl total DNA solution per egg).
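
The gradient arithmetic described above can be verified with a short
calculation. The Python sketch below is illustrative; the function
names are assumptions, and the volumes used for the equal-volume mix
are arbitrary since only their ratio matters.

```python
def mix_percent(parts):
    """Concentration (%) after mixing (percent, volume) parts of one solute."""
    total_volume = sum(volume for _, volume in parts)
    return sum(percent * volume for percent, volume in parts) / total_volume


def loaded_volume_ml(step_ml=0.5, n_steps=3, sample_ul=100.0):
    """Total volume in the tube: gradient steps plus the DNA sample."""
    return step_ml * n_steps + sample_ul / 1000.0


if __name__ == "__main__":
    # Equal volumes of the 20% and 40% stocks give the middle step.
    print(mix_percent([(20.0, 1.0), (40.0, 1.0)]))
    # Three 0.5 ml steps plus 100 ul of DNA must fit in the 2 ml tube.
    print(loaded_volume_ml() <= 2.0)
```

The three steps and the sample together occupy about 1.6 ml, which
leaves headroom in the 2 ml polyallomer tube.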

Recipient females of 6-8 weeks of age are prepared by mating B6CBAF1
females in natural estrus with vasectomized males. Females possessing
copulation plugs are then kept for transfer of microinjected eggs.

Following birth of potential transgenic animals,
tail biopsies are taken, under anesthesia, at four weeks
of age. Tissue samples are placed in 2 ml of tail buffer
(0.3 M Na acetate, 50 mM NaCl, 1.5 mM MgCl2, 10 mM Tris-HCl,
pH 8.5, 0.5% NP40, 0.5% Tween 20) containing 200 µg/ml
proteinase K (Boehringer Mannheim, Mannheim, Germany) and
vortexed. The samples are shaken (250 rpm) at 55-60°C for
3 hours to overnight. DNA prepared from biopsy samples is
examined for the presence of the injected constructs by PCR
and Southern blotting. The digested tissue is vigorously
vortexed, and 5 µl aliquots are placed in 0.5 ml
microcentrifuge tubes. Positive and negative tail samples
are included as controls. Forty µl of silicone oil (BDH,
Poole, UK) is added to each tube, and the tubes are briefly
centrifuged. The tubes are incubated in the heating block
of a thermal cycler (e.g. Omni-gene, Hybaid, Teddington, UK)
to 95°C for 10 minutes. Following this, each tube has a
45 µl aliquot of PCR mix added such that the final
composition of each reaction mix is: 50 mM KCl; 2 mM MgCl2;
10 mM Tris-HCl (pH 8.3); 0.01% gelatin; 0.1% NP40; 10% DMSO;
500 nM each primer; 200 µM dNTPs; 0.02 U/µl Taq polymerase
(Boehringer Mannheim, Mannheim, Germany). The tubes are then
cycled through 30 repeated temperature changes as required
by the particular primers used. The primers may be varied
but in all cases must target the BLG promoter region. This
is specific for the injected DNA fragments because the mouse
does not have a BLG gene. Twelve µl of 5x loading buffer
containing Orange G marker dye (0.25% Orange G (Sigma), 15%
Ficoll type 400 (Pharmacia Biosystems Ltd., Milton Keynes,
UK)) is then added to each tube, and the reaction mixtures
are electrophoresed on a 1.5% agarose gel containing ethidium
bromide (Sigma) until the marker dye has migrated 2/3 of
the length of the gel. The gel is visualized with a UV
light source emitting a wavelength of 254 nm. Transgenic
mice having one or more of the injected DNA fragments are
identified by this approach.
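
The PCR set-up above gives final concentrations after a 45 µl
aliquot of mix is added to a 5 µl sample. The following is a minimal
sketch of the back-calculation to the concentrations needed in the
mix itself. It assumes, for illustration only, that the 5 µl tail
digest contributes none of the listed components; the function and
variable names are hypothetical and are not part of the described
procedure.

```python
# Final concentrations in the reaction, as listed in the text.
FINAL = {
    "KCl (mM)": 50.0,
    "MgCl2 (mM)": 2.0,
    "Tris-HCl pH 8.3 (mM)": 10.0,
    "gelatin (%)": 0.01,
    "NP40 (%)": 0.1,
    "DMSO (%)": 10.0,
    "each primer (nM)": 500.0,
    "dNTPs (uM)": 200.0,
    "Taq polymerase (U/ul)": 0.02,
}

SAMPLE_UL = 5.0   # digested tail aliquot per tube
MIX_UL = 45.0     # PCR mix added per tube


def mix_concentrations(final, sample_ul=SAMPLE_UL, mix_ul=MIX_UL):
    """Concentration each component needs in the PCR mix so that the
    stated final concentration is reached once the sample is added.

    Assumption: the sample itself contributes none of the component.
    """
    factor = (sample_ul + mix_ul) / mix_ul
    return {name: value * factor for name, value in final.items()}


if __name__ == "__main__":
    for name, value in mix_concentrations(FINAL).items():
        print(f"{name}: {value:.4g}")
```

Because the tail buffer does contain Tris, MgCl2 and NP40, a real
calculation would subtract the carry-over from the sample for those
components.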

Positive tail samples are processed to obtain
pure DNA. The DNA samples are screened by Southern
blotting using a BLG promoter probe (nucleotides 2523-4253
of SEQ ID NO: 7).

Southern blot analysis of transgenic mice
prepared essentially as described above demonstrated that
approximately 10% of progeny contained protein C
sequences. Examination of milk from positive animals by
reducing SDS polyacrylamide gel electrophoresis
demonstrated the presence of protein C at concentrations
up to 1 mg/ml.

Example 4

Donor ewes are treated with an intravaginal
progesterone-impregnated sponge (CHRONOGEST Goat Sponge,
Intervet, Cambridge, UK) on day 0. Sponges are left in
situ for ten or twelve days.

Superovulation is induced by treatment of donor
ewes with a total of one unit of ovine follicle
stimulating hormone (OFSH) (OVAGEN, Horizon Animal
Reproduction Technology Pty. Ltd., New Zealand)
administered in eight intramuscular injections of 0.125
units per injection starting at 5:00 pm on day -4 and
ending at 8:00 am on day 0. Donors are injected
intramuscularly with 0.5 ml of a luteolytic agent
(ESTRUMATE, Vet-Drug) on day -4 to cause regression of the
corpus luteum, to allow return to estrus and ovulation.
To synchronize ovulation, the donor animals are injected
intramuscularly with 2 ml of a synthetic releasing hormone
analog (RECEPTAL, Vet-Drug) at 5:00 pm on day 0.
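
The eight-injection regimen above can be laid out as a timetable.
The sketch below assumes the injections alternate between 5:00 pm and
8:00 am, which is the only twice-daily pattern that fits eight
injections between the stated start and end times; the text does not
state the intermediate times explicitly. The function name and the
example date are hypothetical.

```python
from datetime import datetime, timedelta

TOTAL_UNITS = 1.0    # total OFSH dose per donor, from the text
N_INJECTIONS = 8     # number of intramuscular injections, from the text


def fsh_schedule(day0):
    """Return a list of (time, dose) pairs running from 5:00 pm on
    day -4 to 8:00 am on day 0, alternating 5:00 pm and 8:00 am
    (assumed pattern)."""
    t = day0.replace(hour=17, minute=0, second=0, microsecond=0)
    t -= timedelta(days=4)
    dose = TOTAL_UNITS / N_INJECTIONS
    schedule = []
    for _ in range(N_INJECTIONS):
        schedule.append((t, dose))
        # 5:00 pm to the next 8:00 am is 15 h; 8:00 am to 5:00 pm is 9 h.
        t += timedelta(hours=15 if t.hour == 17 else 9)
    return schedule


if __name__ == "__main__":
    for when, dose in fsh_schedule(datetime(2000, 1, 10)):
        print(when.strftime("%Y-%m-%d %H:%M"), f"{dose:.3f} units")
```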

Donors are starved of food and water for at
least 12 hours before artificial insemination (A.I.). The
animals are artificially inseminated by intrauterine
laparoscopy under sedation and local anesthesia on day 1.
Either xylazine (ROMPUN, Vet-Drug) at a dose rate of
0.05-0.1 ml per 10 kg bodyweight or ACP injection 10 mg/ml
(Vet-Drug) at a dose rate of 0.1 ml per 10 kg bodyweight
is injected intramuscularly approximately fifteen minutes
before A.I. to provide sedation. A.I. is carried out
using freshly collected semen from a Poll Dorset ram.
Semen is diluted with equal parts of filtered phosphate
buffered saline, and 0.2 ml of the diluted semen is
injected per uterine horn. Immediately pre- or post-A.I.,
donors are given an intramuscular injection of AMOXYPEN
(Vet-Drug).

Fertilized eggs are recovered on day 2 following
starvation of donors of food and water from 5:00 pm on day
1. Recovery is carried out under general anesthesia
induced by an intravenous injection of 5% thiopentone
sodium (INTRAVAL SODIUM, Vet-Drug) at a dose rate of 3 ml
per 10 kg bodyweight. Anesthesia is maintained by
inhalation of 1-2% Halothane/O2/N2O. To recover the
fertilized eggs, a laparotomy incision is made, and the
uterus is exteriorized. The eggs are recovered by
retrograde flushing of the oviducts with Ovum Culture
Medium (Advanced Protein Products, Brierly Hill, West
Midlands, UK) supplemented with bovine serum albumin of
New Zealand origin. After flushing, the uterus is
returned to the abdomen, and the incision is closed.
Donors are allowed to recover post-operatively or are
euthanized. Donors that are allowed to recover are given
an intramuscular injection of Amoxypen L.A. at the
manufacturer's recommended dose rate immediately pre- or
post-operatively.

Plasmids containing the protein C DNA are
digested with Mlu I, and the expression unit fragments are
recovered and purified on sucrose density gradients. The
fragment concentrations are determined by fluorimetry and
diluted in Dulbecco's phosphate buffered saline without
calcium and magnesium or TE as described above. The
concentration is adjusted to 6 µg/ml, and approximately 2
pl of the mixture is microinjected into one pronucleus of
each fertilized egg with visible pronuclei.
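
As a worked illustration of the quantities involved, the sketch below
converts the stated concentration (6 µg/ml) and injection volume
(about 2 pl) into a DNA mass per injection. The copy-number helper
needs a fragment length, which is not given in this passage, so the
10,000 bp used in the example is purely hypothetical, as is the
average of 650 g/mol per base pair.

```python
AVOGADRO = 6.022e23        # molecules per mole
G_PER_MOL_PER_BP = 650.0   # approximate average mass of one base pair (assumption)


def injected_mass_fg(conc_ug_per_ml, volume_pl):
    """Mass of DNA delivered per injection, in femtograms."""
    grams_per_litre = conc_ug_per_ml * 1e-3      # 1 ug/ml equals 1e-3 g/l
    grams = grams_per_litre * volume_pl * 1e-12  # 1 pl equals 1e-12 l
    return grams / 1e-15                         # 1 fg equals 1e-15 g


def approx_copies(mass_fg, fragment_bp):
    """Approximate number of fragment copies in a given mass of DNA."""
    moles = (mass_fg * 1e-15) / (fragment_bp * G_PER_MOL_PER_BP)
    return moles * AVOGADRO


if __name__ == "__main__":
    mass = injected_mass_fg(6.0, 2.0)
    print(f"DNA per injection: {mass:.1f} fg")
    # Hypothetical fragment length, for illustration only.
    print(f"Copies for a 10,000 bp fragment: {approx_copies(mass, 10_000):.0f}")
```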

All fertilized eggs surviving pronuclear
microinjection are cultured in vitro at 38.5°C in an
atmosphere of 5% CO2:5% O2:90% N2 and about 100% humidity
in a bicarbonate buffered synthetic oviduct medium (see
Table) supplemented with 20% v/v vasectomized ram serum.
The serum may be heat inactivated at 56°C for 30 minutes
and stored frozen at -20°C prior to use. The fertilized
eggs are cultured for a suitable period of time to allow
early embryo mortality (caused by the manipulation
techniques) to occur. These dead or arrested embryos are
discarded. Embryos having developed to 5 or 6 cell
divisions are transferred to synchronized recipient ewes.

Table

Synthetic Oviduct Medium

Stock A (Lasts 3 Months)
    NaCl                         6.29 g
    KCl                          0.534 g
    KH2PO4                       0.162 g
    MgSO4.7H2O                   0.182 g
    Penicillin                   0.06 g
    Sodium Lactate 60% syrup     0.6 mls
    Super H2O                    99.4 mls

Stock B
    NaHCO3                       0.21 g
    Phenol red                   0.001 g
    Super H2O                    10 mls

Stock C (Lasts 2 Weeks)
    Sodium Pyruvate              0.051 g
    Super H2O                    10 mls

Stock D (Lasts 3 Months)
    CaCl2.2H2O                   0.262 g
    Super H2O                    10 mls

Stock E (Lasts 3 Months)
    Hepes                        0.651 g
    Phenol red                   0.001 g
    Super H2O                    10 mls

To make up 10 mls of Bicarbonate Buffered Medium
    STOCK A                      1 ml
    STOCK B                      1 ml
    STOCK C                      0.07 ml
    STOCK D                      0.1 ml
    Super H2O                    7.83 ml

Osmolarity should be 265-285 mOsm.
Add 2.5 ml of heat inactivated sheep serum
and filter sterilize.

To make up 10 mls of HEPES Buffered Medium
    STOCK A                      1 ml
    STOCK B                      0.2 ml
    STOCK C                      0.07 ml
    STOCK D                      0.1 ml
    STOCK E                      0.8 ml
    Super H2O                    7.83 ml

Osmolarity should be 265-285 mOsm.
Add 2.5 ml of heat inactivated sheep serum
and filter sterilize.
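
The two working-medium recipes in the Table can be checked
arithmetically. The short sketch below sums the component volumes and
computes the serum fraction after the 2.5 ml serum addition, which
can be compared with the 20% v/v serum supplement mentioned for the
culture medium. The dictionary and function names are illustrative
only.

```python
# Component volumes in ml, transcribed from the Table.
BICARBONATE = {
    "Stock A": 1.0, "Stock B": 1.0, "Stock C": 0.07,
    "Stock D": 0.1, "Super H2O": 7.83,
}
HEPES = {
    "Stock A": 1.0, "Stock B": 0.2, "Stock C": 0.07,
    "Stock D": 0.1, "Stock E": 0.8, "Super H2O": 7.83,
}
SERUM_ML = 2.5  # heat inactivated sheep serum added to each 10 ml batch


def total_ml(recipe):
    """Total volume of a recipe before serum is added."""
    return sum(recipe.values())


def serum_fraction(recipe, serum_ml=SERUM_ML):
    """Volume fraction of serum in the finished medium."""
    return serum_ml / (total_ml(recipe) + serum_ml)


if __name__ == "__main__":
    for name, recipe in (("Bicarbonate", BICARBONATE), ("HEPES", HEPES)):
        print(f"{name}: {total_ml(recipe):.2f} ml base, "
              f"{serum_fraction(recipe):.0%} v/v serum")
```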



Recipient ewes are treated with an intravaginal
progesterone-impregnated sponge (Chronogest Ewe Sponge or
Chronogest Ewe-Lamb Sponge, Intervet) left in situ for 10
or 12 days. The ewes are injected intramuscularly with
1.5 ml (300 iu) of a follicle stimulating hormone
substitute (P.M.S.G., Intervet) and with 0.5 ml of a
luteolytic agent (Estrumate, Coopers Pitman-Moore) at
sponge removal on day -1. The ewes are tested for estrus
with a vasectomized ram between 8:00 am and 5:00 pm on
days 0 and 1.

Embryos surviving in vitro culture are returned
to recipients (starved from 5:00 pm on day 5 or 6) on day
6 or 7. Embryo transfer is carried out under general
anesthesia as described above. The uterus is exteriorized
via a laparotomy incision with or without laparoscopy.
Embryos are returned to one or both uterine horns only in
ewes with at least one suitable corpus luteum. After
replacement of the uterus, the abdomen is closed, and the
recipients are allowed to recover. The animals are given
an intramuscular injection of Amoxypen L.A. at the
manufacturer's recommended dose rate immediately pre- or
post-operatively.
Lambs are identified by ear tags and left with
their dams for rearing. Ewes and lambs are either housed
and fed complete diet concentrates and other supplements
and/or ad lib. hay, or are let out to grass.

Within the first week of life (or as soon
thereafter as possible without prejudicing health), each
lamb is tested for the presence of the heterologous DNA by
two sampling procedures. Following tail biopsy, within a
week, a 10 ml blood sample is taken from the jugular vein
into an EDTA vacutainer. Tissue samples are taken by tail
biopsy as soon as possible after the tail has become
desensitized after the application of a rubber elastrator
ring to its proximal third (usually within 200 minutes
after "tailing"). The tissue is placed immediately in a
solution of tail buffer. Tail samples are kept at room
temperature and analyzed on the day of collection. All
lambs are given an intramuscular injection of Amoxypen
L.A. at the manufacturer's recommended dose rate
immediately post-biopsy, and the cut end of the tail is
sprayed with an antibiotic spray.

DNA is extracted from sheep blood by first
separating white blood cells. A 10 ml sample of blood is
diluted in 20 ml of Hank's buffered saline (HBS; obtained
from Sigma Chemical Co.). Ten ml of the diluted blood is
layered over 5 ml of Histopaque (Sigma) in each of two 15
ml screw-capped tubes. The tubes are centrifuged at 3000
rpm (2000 x g max.), low brake for 15 minutes at room
temperature. White cell interfaces are removed to a clean
15 ml tube and diluted to 15 ml in HBS. The diluted cells
are spun at 3000 rpm for 10 minutes at room temperature,
and the cell pellet is recovered and resuspended in 2-5 ml
of tail buffer.

To extract DNA from the white cells, 10% SDS is
added to the resuspended cells to a final concentration of
1%, and the tube is inverted to mix the solution. One mg
of fresh proteinase K solution is added, and the mixture
is incubated overnight at 45°C. DNA is extracted using an
equal volume of phenol/chloroform (x3) and
chloroform/isoamyl alcohol (x1). The DNA is then
precipitated by adding 0.1 volume of 3 M NaOAc and 2
volumes of ethanol, and the tube is inverted to mix. The
precipitated DNA is spooled out using a clean glass rod
with a sealed end. The spool is washed in 70% ethanol,
and the DNA is allowed to partially dry, then is
redissolved in TE (10 mM Tris-HCl, 1 mM EDTA, pH 7.5).

DNA samples from blood and tail are analyzed by
Southern blotting using probes for the BLG promoter region
and the protein C coding regions.
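
In the white-cell lysis step above, 10% SDS is added to a final
concentration of 1%. Because the added stock itself increases the
volume, the amount to add is slightly more than one tenth of the
starting volume. The sketch below shows the calculation for a
resuspension volume within the stated 2-5 ml range; the function name
and the 4.5 ml example are illustrative assumptions.

```python
def stock_volume_to_add(start_volume, stock_conc, final_conc):
    """Volume of stock to add to start_volume so that the mixture ends
    at final_conc. Solves stock_conc * v = final_conc * (start_volume + v).
    Concentrations may be in any unit, as long as both use the same one.
    """
    if not 0 < final_conc < stock_conc:
        raise ValueError("final concentration must be between 0 and the stock")
    return start_volume * final_conc / (stock_conc - final_conc)


if __name__ == "__main__":
    cells_ml = 4.5  # hypothetical resuspension volume
    sds_ml = stock_volume_to_add(cells_ml, stock_conc=10.0, final_conc=1.0)
    print(f"Add {sds_ml:.2f} ml of 10% SDS to {cells_ml} ml of cells")
```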

A founder female animal, designated 30851, which
is transgenic for both BLG and pCORPS was generated. She
has given rise to two sons and a transgenic daughter,
designated 40387. Recombinant transgenic protein C was
purified from milk (from 30851) by a single chromatography
step using a calcium-dependent monoclonal antibody
affinity column. Briefly, the milk samples were pooled up
to a volume of 40 ml. Two volumes of ice-cold 1 x TBS (50
mM Tris-HCl, 150 mM NaCl pH 6.5) and 200 mM EDTA, pH 6.5
were added to solubilise the caseins. The EDTA-treated
milk solution was centrifuged at 15,000 rpm for 30 minutes
at 4°C in a JA20 rotor (Beckman Instruments, Irvine, CA).
After centrifugation, the upper lipid phase and the small
pellet were discarded.

The EDTA-treated milk was diluted with an equal
volume of ice-cold 1 x TBS and 133 mM CaCl2 while
stirring. A cloudy precipitate formed upon addition of
the CaCl2. The pH was quickly adjusted by addition of a
few drops of 4 M NaOH, and the precipitate was
redissolved. Any remaining insoluble material was removed
by filtration through a 0.45 µm filter.

The optical density of the solubilised milk was
measured at 280 nm, and the protein concentration was
calculated. The milk was diluted to a protein
concentration of 10 mg/ml using 1 x TBS containing CaCl2
to give a final Ca++ concentration of 25 mM. The milk was
used to resuspend antibody-Sepharose that carried the
immobilized Ca++-dependent monoclonal antibody PCL-2, and
had been washed in 1 x TBS and 25 mM CaCl2. PCL-2 is a
monoclonal antibody that binds single chain and two chain
protein C, whether or not they are gamma-carboxylated.
The milk-Sepharose mixture was incubated overnight at 4°C.

The matrix was washed twice in batch with 1 x
TBS and 25 mM CaCl2 and packed into a glass column. The
resin was washed at a flow rate of 1 ml/min with a calcium
containing buffer and a stable baseline was achieved
before the bound protein was eluted with an isocratic
elution using 1 x TBS and 25 mM EDTA, pH 6.5. Fractions
containing protein C were pooled and concentrated to
approximately 1 ml using an Amicon ultrafiltration unit
with a 10 kDa cut-off membrane (Amicon, Danvers, MA).

The monoclonal antibody, PCL-2, was coupled to
the activated Sepharose 4B as follows: 1 g (3.5 ml of gel)
of cyanogen bromide activated Sepharose 4B (Pharmacia LKB
Biotechnology, Piscataway, NJ) was swollen for 15 minutes
in 1 mM HCl. The swollen gel was resuspended in 0.1 M
NaHCO3, 0.5 M NaCl pH 8.3 and washed several times. The
washed gel was resuspended in 11 ml of monoclonal antibody
solution (PCL-2, 3.5 mg/ml in bicarbonate buffer pH 8.3)
with a coupling ratio of approximately 10 mg/ml gel.
Coupling was allowed to proceed for 2 h at room
temperature on a rotary mixer, and the gel was recovered
by gentle centrifugation. The monoclonal supernatant was
removed and replaced by 1 M ethanolamine in order to block
any remaining sites on the Sepharose. Blocking was
performed overnight at 4°C. Excess adsorbed protein was
removed by sequential acid and alkali washes (0.1 M
acetate, 0.5 M NaCl pH 4.0; 0.1 M NaHCO3, 0.5 M NaCl pH
8.3), and the coupled gel was stored in 50 mM Tris-HCl,
150 mM NaCl pH 6.5, 0.02% azide.

Example 6

Samples of purified recombinant transgenic
protein C were compared with plasma-derived protein C and
plasma-derived activated protein C (APC) preparations.
Samples were run on SDS PAGE 4-20% acrylamide gradient
gels under reducing conditions and silver stained for
protein.

The plasma-derived material shows the presence
of a heavy-chain doublet around 44 kDa (Figure 1, Lane 1).
This has been reported to be due to partial occupancy of
the three possible N-linked glycosylation sites on the
molecule. A similar doublet, although of a slightly lower
mass presumably due to some subtle change in glycosylation
profile, has also been seen with the transgenic protein C.
The light chain was visible around 22 kDa for both
preparations. Significantly, in the case of the
plasma-derived material, uncleaved single chain was clearly
visible above the heavy chain doublet. Plasma-derived
protein normally contains 5-10 percent of this inactive
material. In contrast, the transgenic protein C contains
no obvious single chain by this gel analysis. Therefore,
it contains at most a few percent of inactive
material. This most likely reflects the increased
efficiency of cleavage of the modified inter-chain site.
In further support of this observation, no single chain was
visible by direct western blot analysis of transgenic
sheep milk (40387, expression level 300 µg/ml).

The purified transgenic protein C was further 
characterized as follows: 
A. ELISA

An enzyme-linked immunosorbent assay (ELISA) for
protein C was done as follows: Affinity-purified
polyclonal antibody to human protein C (100 µl of 1 µg/ml
in 0.1 M Na2CO3, pH 9.6) was added to each well of a
96-well microtiter plate, and the plates were incubated
overnight at 4°C. The wells were then washed three times
with phosphate buffered saline (PBS) containing 0.05%
Tween-20 and incubated with 100 µl of 1% bovine serum
albumin (BSA), 0.05% Tween-20 in PBS at 4°C overnight. The
plates were then rinsed several times with PBS, air dried
and stored at 4°C. To assay samples, 100 µl of each sample
was incubated for 1 h at 37°C with a biotin-conjugated
sheep polyclonal antibody to protein C (30 ng/ml) in PBS
containing 1% BSA and 0.05% Tween-20. After incubation,
the wells were rinsed with PBS, and alkaline phosphatase
activity was measured by the addition of 100 µl of
phosphatase substrate (Sigma, St. Louis, MO) in 10%
diethanolamine, pH 9.8, containing 0.3 mM MgCl2. The
absorbance at 405 nm was read on a microtiter plate
reader. Quantitation was by comparison with a standard
curve using plasma-derived protein C quantitated by amino
acid analysis.
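
Quantitation against a standard curve, as described above, can be
illustrated with a simple piecewise-linear interpolation. This is a
generic sketch and not the curve-fitting method used for the assay,
which the text does not specify. The standard concentrations and
absorbance readings are invented for illustration, and the function
name is hypothetical.

```python
from bisect import bisect_left


def interpolate_concentration(absorbance, standards):
    """Estimate a concentration from an absorbance reading.

    standards: (concentration, absorbance) pairs whose absorbances
    increase with concentration. Readings outside the range covered by
    the standards are rejected rather than extrapolated.
    """
    points = sorted(standards, key=lambda p: p[1])
    readings = [a for _, a in points]
    if not readings[0] <= absorbance <= readings[-1]:
        raise ValueError("absorbance outside the range of the standard curve")
    i = bisect_left(readings, absorbance)
    if readings[i] == absorbance:
        return points[i][0]
    (c0, a0), (c1, a1) = points[i - 1], points[i]
    return c0 + (c1 - c0) * (absorbance - a0) / (a1 - a0)


if __name__ == "__main__":
    # Hypothetical standards: (ng/ml protein C, absorbance at 405 nm).
    standards = [(0.0, 0.05), (10.0, 0.25), (20.0, 0.45), (40.0, 0.85)]
    print(interpolate_concentration(0.35, standards))
```

Samples reading above the top standard would normally be diluted and
re-assayed rather than extrapolated, which is why the sketch raises
an error in that case.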

B. Amino-Terminal Sequencing

Amino-terminal sequencing of the transgenic
material was performed to ascertain the extent of
prosequence removal and to evaluate the presence of
gamma-carboxylation. There were three possible N-terminal
sequences of protein C. These were: 1) the prosequence, which
directs gamma-carboxylation and could have remained on the
light chain if the first cleavage site was incompletely
processed, 2) the light chain and 3) the heavy chain.
N-terminal sequencing of protein C obtained from transgenic
milk should have contained only the latter two sequences
if correct processing had occurred at both of the cleavage
sites. Amino-terminal sequencing would have also been
expected to reveal the presence of gamma-carboxylation in
the light chain. There are nine sites of carboxylation in
the first twenty-nine amino acids of the light chain. On
an analysis of released amino acids, the
PTH-gamma-carboxylic acid derivatives eluted from the HPLC
column in the break-through and could therefore not be
analyzed. Thus, a gamma carboxylic acid showed up on the
amino-terminal sequence as a space rather than a glutamic
acid.

The yields of amino acids in pmol released from
the sequencing of approximately 27 pmol (1.4 µl) of
purified transgenic protein C corresponded well to those
expected for an equimolar mixture of light and heavy
chains, and no obvious sequence was discernible for the
prosequence. Moreover, no other aberrant sequences were
detected, thus indicating a lack of inappropriate
proteolytic cleavages.

As stated previously, gamma-carboxylated
glutamate residues were expected to sequence as blanks
using standard instrument conditions. However, sequencing
protein C gives a double sequence which must be
deconvoluted using knowledge of the expected light and
heavy chain sequences. Normally, if the light chain alone
were sequenced, the Gla residues at positions six and seven
would appear as blanks. However, when sequenced as intact
protein C, the heavy chain sequence contains a glutamate
residue at position six. Therefore, the only indirect
confirmation of the presence of a Gla residue in the light
chain was the absence of glutamate at position seven, which
was not 'over written' by a glutamate in the heavy chain
(Figure 2). Two other indirect confirmations of the
presence of gamma carboxylation of the transgenic product
are described below.
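
The reasoning about the double sequence can be made concrete with a
small sketch. The two ten-residue N-terminal sequences below are
assumptions supplied for illustration (they are not quoted in this
passage), as are the function names. The sketch predicts which
residues each sequencing cycle should release when both chains are
sequenced together and Gla is released as a blank, and it reports
which Gla positions remain informative.

```python
# Assumed N-terminal sequences (one-letter code), for illustration only.
LIGHT = "ANSFLEELRH"    # light chain; positions 6 and 7 are Gla when carboxylated
HEAVY = "DTEDQEDQVD"    # heavy chain of the two-chain zymogen
GLA_POSITIONS = {6, 7}  # 1-based Gla positions within the first ten residues


def expected_cycle(cycle):
    """Residues expected at a sequencing cycle (1-based) for an equimolar
    mixture of both chains, with Gla appearing as a blank."""
    light = None if cycle in GLA_POSITIONS else LIGHT[cycle - 1]
    heavy = HEAVY[cycle - 1]
    return {residue for residue in (light, heavy) if residue is not None}


def gla_evidence(observed_by_cycle):
    """Gla positions supported by the data: positions where the heavy
    chain does not itself supply Glu and no Glu was observed."""
    supported = []
    for pos in sorted(GLA_POSITIONS):
        if HEAVY[pos - 1] == "E":
            continue  # masked: heavy-chain Glu overwrites the blank
        if "E" not in observed_by_cycle.get(pos, set()):
            supported.append(pos)
    return supported


if __name__ == "__main__":
    for cycle in range(1, 11):
        print(cycle, sorted(expected_cycle(cycle)))
    print("informative Gla positions:", gla_evidence({6: {"E"}, 7: {"D"}}))
```

Under these assumptions only position seven is informative, which
matches the argument in the text.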

C. Mass Analysis of the Purified Light Chain

The protein sequence of the transgenic-derived
protein C precursor had been modified with an
Arg-Arg-Lys-Arg (SEQ ID NO: 20) cleavage site between the
light and heavy chains to promote more efficient cleavage of
the single chain to two-chain form. Western blot analysis of
the transgenic protein C milk and examination of the
purified protein C on reducing gels had already confirmed
that efficient cleavage had occurred. Normally during
secretion, but after cleavage of the plasma-derived
material, the two basic amino acids at the
carboxy-terminus of the light chain are trimmed back by a
basic carboxypeptidase. Establishing whether the
carboxy-terminus of the transgenic protein C light chain had
been processed to remove the two extra basic amino acids
introduced by this modification, as well as the two natural
ones, was achieved by measuring the mass of the purified
light chain in a quadrupole instrument using on-line liquid
chromatography and electro-spray ionization. In order to
achieve this, all of the cysteine residues of protein C
were reduced and alkylated, and then the two chains were
separated by reversed-phase chromatography.

C1. Reductive Alkylation

Because protein C is heavily cross-linked for a
molecule of approximately 52 kDa, with twelve disulfide
bridges (17 of the 24 cysteines involved are in the light
chain), it was necessary to reductively alkylate the
entire protein before attempting to separate the chains by
reversed-phase chromatography. In view of the large
number of cysteines in the light chain, alkylation was
done with iodoacetamide, in place of the more commonly
used vinyl pyridine, to prevent the molecule from becoming
excessively hydrophobic.

The transgenic protein C material (6 nmol of
protein or 144 nmol of thiol) was reductively alkylated as
follows: 0.5 mg of protein C (by ELISA) in 0.5 ml of TBS
was added to 50 µl of 1 M Tris pH 8.0, 450 µl water, 570
mg guanidinium chloride, and 10 µl of 50 mg/ml DTT (0.3
M, representing a 20 fold excess of added thiol over
cysteine thiol). The mixture was incubated for 2 hours at
37°C. After incubation, 20 µl of 120 mg/ml iodoacetamide
(0.6 M, representing a 2 fold excess over DTT on a molar
basis) was added, and the mixture was incubated in the
dark for one hour at 4°C. The reaction was quenched by
adding 50 µl of 50 mg/ml DTT, representing a 2.5 fold
excess over iodoacetamide. The sample (final volume 1.5
ml) was stored at -20°C until analysis.
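
The reagent quantities in the alkylation step can be converted to
molar terms with the short sketch below. The molecular weights of DTT
and iodoacetamide are supplied here as assumptions (they are not
given in the text), and the function names are illustrative. The
sketch reports the computed amounts and leaves the choice of
convention for "fold excess" (per molecule or per thiol group) to the
reader.

```python
MW_DTT = 154.25   # g/mol, dithiothreitol (assumed value)
MW_IAA = 184.96   # g/mol, iodoacetamide (assumed value)
CYS_PER_PROTEIN = 24  # cysteines per protein C molecule, from the text


def molar(mg_per_ml, mol_weight):
    """Molarity (mol/l) of a solution given in mg/ml."""
    return mg_per_ml / mol_weight


def micromoles(volume_ul, mg_per_ml, mol_weight):
    """Amount of reagent (umol) in a given volume (ul)."""
    return volume_ul * molar(mg_per_ml, mol_weight)


def protein_thiol_nmol(protein_nmol, cys_per_protein=CYS_PER_PROTEIN):
    """Total cysteine thiol (nmol) after complete reduction."""
    return protein_nmol * cys_per_protein


if __name__ == "__main__":
    thiol_umol = protein_thiol_nmol(6) / 1000.0
    dtt_umol = micromoles(10, 50, MW_DTT)
    iaa_umol = micromoles(20, 120, MW_IAA)
    quench_umol = micromoles(50, 50, MW_DTT)
    print(f"DTT stock: {molar(50, MW_DTT):.2f} M")
    print(f"Iodoacetamide stock: {molar(120, MW_IAA):.2f} M")
    print(f"Protein thiol: {thiol_umol:.3f} umol")
    print(f"DTT added: {dtt_umol:.2f} umol")
    print(f"Iodoacetamide added: {iaa_umol:.2f} umol")
    print(f"Quenching DTT: {quench_umol:.2f} umol")
```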

D. Purification of the Light Chain

Purification of protein C light chain was
achieved using a large pore polystyrene column with
divinyl benzene interactive groups (PLRP-S, 4000A, 8 µm,
2.1 mm ID; Polymer Laboratories, Shropshire, UK). The
optimum conditions for separation of the heavy and light
chains were determined to be: solvent A (0.1% TFA) and
solvent B (100% acetonitrile) at a flow of 0.5 ml/min with
a detector wavelength of 215 nm and a gradient of 30 to
60% solvent B over 60 min.
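
The gradient stated above (30 to 60% solvent B over 60 min) can be
written as a simple function of time, which is convenient when
relating a fraction's elution time to solvent composition. The sketch
assumes a linear gradient and assumes the composition is held
constant before the start and after the end; the function name is
hypothetical.

```python
def percent_b(t_min, start=30.0, end=60.0, duration=60.0):
    """Percentage of solvent B at time t_min for a linear gradient.

    Outside the gradient window the composition is held at the start
    or end value (an assumption, not stated in the text).
    """
    if t_min <= 0:
        return start
    if t_min >= duration:
        return end
    return start + (end - start) * t_min / duration


if __name__ == "__main__":
    for t in (0, 15, 30, 45, 60):
        print(f"{t:>2} min: {percent_b(t):.1f}% B")
```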
Fractions were collected across the eluted
peaks, and samples (10 µl) were analyzed by SDS PAGE on
4-20% gradient acrylamide gels under non-reducing
conditions. The light chain (fractions 3 to 5) was
completely resolved from both the heavy chain (fractions 7
to 9) and a single fraction (6) which contained a mixture
of heavy chain and what appeared to be unglycosylated
light chain.

A sample containing fully resolved light chain
was prepared for deglycosylation by centrifugal
evaporation under reduced pressure at room temperature.
Deglycosylation was carried out using peptide N-glycanase
(PNGase; Oxford Glycosystems, Oxford, UK). The protein
sample was redissolved in 50 µl of buffer and incubated
overnight with 1 unit (5 µl) PNGase, according to
manufacturer's specifications.

The light chain was purified from reduced and 
alkylated plasma-derived protein C by the same method and 
deglycosylated for further analysis. 

E. Analysis by Mass Spectroscopy

Samples of purified light chain were subjected
to mass analysis using a liquid chromatography
electrospray interface to a Sciex Quadrupole Mass Analyser
(Sciex/Perkin Elmer, Toronto, CA). The LC system used a
0.5 mm ID column packed with PLRP-S 4000A, 5 µm resin
(Polymer Laboratories). The solvent system contained
buffer A (0.1% formic acid), buffer B (0.1% formic acid
and a 5:2 (v/v) mixture of ethanol to propan-1-ol). The
gradient used was from 5-60% buffer B over 35 minutes at a
flow rate of 25 µl per minute. The outflow of the column
was linked via a UV detector to the mass spectrometer
which was run in positive-ion mode.

The purified and deglycosylated transgenic light
chain was analyzed and gave a relatively weak spectrum
which was reconstructed to give two components with masses
of 18,911.0 and 18,971.0. The plasma light chain was also
analyzed and gave a stronger signal with a single major
component. The spectrum of the plasma light chain was
reconstructed to give a single mass of 18,970.0.
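
"Reconstruction" of an electrospray spectrum refers to converting a
series of multiply charged peaks into a single neutral mass. The
sketch below shows the textbook calculation for two adjacent charge
states in positive-ion mode. It is a generic illustration and not the
instrument's reconstruction software; the peak positions are
generated from the reported 18,970.0 mass and an arbitrarily chosen
charge state, and the function names are hypothetical.

```python
PROTON = 1.00728  # mass of a proton in Da


def mz(mass, z):
    """m/z of a species of neutral mass `mass` carrying z protons."""
    return (mass + z * PROTON) / z


def deconvolute_pair(mz_low_charge, mz_high_charge):
    """Neutral mass from two adjacent peaks of the same species.

    mz_low_charge carries z protons and mz_high_charge carries z + 1,
    so mz_low_charge > mz_high_charge. Returns (z, mass).
    """
    z = round((mz_high_charge - PROTON) / (mz_low_charge - mz_high_charge))
    mass = z * (mz_low_charge - PROTON)
    return z, mass


if __name__ == "__main__":
    # Synthetic peaks for charge states 12 and 13 (illustrative choice).
    p12, p13 = mz(18970.0, 12), mz(18970.0, 13)
    z, mass = deconvolute_pair(p12, p13)
    print(f"z = {z}, reconstructed mass = {mass:.1f}")
```

Real reconstruction algorithms combine many charge states and weight
them by intensity, which is one reason a weak spectrum yields a less
certain mass.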

The predicted mass for the light chain carrying
nine gamma-carboxy glutamic acids, one β-hydroxy aspartic
acid and seventeen carbamidomethyl cysteine residues and
ending with Leu155 was 18,955.9723, which is very close to
the masses detected for the transgenic (18,971.0) and
plasma-derived (18,970.0) light chains. The small
differences in mass were well within the accuracy
limitations for this instrument, particularly with the LC
delivery. This shows that the mass of the
reductively-alkylated and deglycosylated transgenic light
chain is essentially identical to that for the plasma-derived
protein C. This implies that both molecules have
undergone the same post-translational modifications and
that the transgenic material is fully gamma carboxylated,
has had all four basic amino acids trimmed back from the
carboxy-terminus of the light chain and has a single
β-hydroxy aspartic acid.

F. Activity Measurement

The activity of the transgenic protein C was
compared with that of the plasma-derived material in a
coagulation assay. First, each sample of protein C,
quantitated by amino acid composition analysis, was
activated by incubation with Protac, a snake venom
(American Diagnostica Inc, Greenwich, CT) at a venom to
protein ratio of 1 Unit Protac: 10 µg protein C for 60
minutes at 37°C. Aliquots of the activated material were
then compared for their ability to prolong the clotting
time of protein C depleted human plasma (Diagnostic
Reagents Ltd) in the presence of activated partial
thromboplastin time reagent - cephalin from rabbit brain
(Sigma) and calcium using a mechanical coagulometer
(Diagnostica Stago, Asnieres, FR). A comparison of
clotting times with various additions of transgenic and
plasma-derived protein C (Figure 3) shows that the two
preparations had the same anti-coagulant activity per mg
of protein.

In summary, results show that the sheep-derived
transgenic protein C is correctly post-translationally
processed, with respect to gamma-carboxylation and
probably beta-hydroxylation, and has anticoagulant
activity fully equivalent to a high quality purified
plasma standard. The results demonstrate that the
C-terminal processing of the light chain, with the modified
RRKR cleavage site rather than the naturally occurring KR
site, has the two extra basic amino acids removed along
with the natural ones.

From the foregoing, it will be appreciated that,
although specific embodiments of the invention have been
described herein for purposes of illustration, various
modifications may be made without deviating from the
spirit and scope of the invention. Accordingly, the
invention is not limited except as by the appended claims.

SEQUENCE LISTING

(1) GENERAL INFORMATION:

(i) APPLICANTS: ZymoGenetics, Inc.
1201 Eastlake Avenue East
Seattle
WA
USA
98102

PPL Therapeutics
Roslin
Edinburgh
Scotland
UK
EH25 9PP

(ii) TITLE OF INVENTION: PROTEIN C PRODUCTION IN TRANSGENIC
ANIMALS

(iii) NUMBER OF SEQUENCES: 25

(iv) CORRESPONDENCE ADDRESS:
(A) ADDRESSEE: ZymoGenetics, Inc.
(B) STREET: 1201 Eastlake Avenue East
(C) CITY: Seattle
(D) STATE: WA
(E) COUNTRY: USA
(F) ZIP: 98102

(v) COMPUTER READABLE FORM:
(A) MEDIUM TYPE: Floppy disk
(B) COMPUTER: IBM PC compatible
(C) OPERATING SYSTEM: PC-DOS/MS-DOS
(D) SOFTWARE: PatentIn Release #1.0, Version #1.25

(vi) CURRENT APPLICATION DATA:
(A) APPLICATION NUMBER:
(B) FILING DATE:
(C) CLASSIFICATION:

(viii) ATTORNEY/AGENT INFORMATION:
(A) NAME: Sawislak, Deborah A
(B) REGISTRATION NUMBER: 37,438
(C) REFERENCE/DOCKET NUMBER: 95-28PC

(ix) TELECOMMUNICATION INFORMATION:
(A) TELEPHONE: 206-442-6672
(B) TELEFAX: 206-442-6678

(2) INFORMATION FOR SEQ ID NO:1:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 11725 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: join(3520..3530, 5093..5117, 5210..5347, 5450..5584,
8253..8395, 9269..9386, 10516..11102)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:

AGTGAATCT6 GGCGAGTAAC ACAAAACHG AGTGTCCHA CCTGAAAAAT AGAGGTTAGA 60 

GGGATGCTAT GTGCCAHGT GTGTGTGTGT TGGGGGTGG6 GATTG6GGGT GATnGTGAG 120 
CAATTG6AGG TGAGGGTGGA GCCCAGTGCC CAGCACCTAT 6CACTGGGGA CCCAAAAAGG 180

AGCATCTTCT CATGAmTA TGTATCAGAA ATTGGGATGG CATGTCATTG GGACAGCGTC 240 

TTTTTTCTTG TATGGTGGCA CATAAATACA TGTGTCHAT AAHAATCGT ATTTTAGAn 300 

TGACGAAATA TGGMTATTA CCTGTTGTGC TGATCHGGG CAAACTATAA TATCTCTGGG 360 

CAAAAATGTC CCCATCTGAA AAACAGGGAC AACGHCCTC CCTCAGCCAG CCACTATGGG 420 

GCTAAAATGA GACCACATCT GTCAAGGGH nGCCCTCAC CTCCCTCCCT 6CT6GAT6GC 480

ATCCnCGTA GGCAGAGGTG GGCTTCGGGC AGAACAAGCC GTGCTGAGCT AG6ACCAGGA 540

GTGCTAGTGC CACTGTnGT CTATGGAGAG GGAGGCCTCA GTGCTGA6GG CCAAG(jiWAT 600

ATTTGTGGTT ATGGATTAAC TCGAACTCCA GGCTGTCATG GCGGCAGGAC GGCGA/^CTTG 660

CA6TATCTCC ACGACCCGCC CCTGTGAGTC CCCCTCCAGG CAGGTCTATG AGGG6TGTGG 720

AfiGGAGGGCT GCCCCCGGGA GAAGAGAGCT AGGTGGTGAT GA66GCTGAA TCCTCCIAGCC 780

AGGGTGCTCA ACAAGCCTGA GCnCGGGTA AAAGGACACA AGGCCCTCCA CAGGCCAGGC 840

CTGGCAGCCA CAGTCTCAGG TCCCTTTGCC ATGCGCCTCC CTCTTTCCAG GCCAAGGGTC 900

CCCAGGCCCA GGGCCATTCC AACAGACAGT TTGGAGCCCA GGACCCTCCA HCTCCCCAC 960

CCCACnCCA CCmGGGGG TGTCGGATTT GAACAAATCT CA6AAGCGGC CTCAGAGGGA 1020

GTCGGCAAGA A1GGA6AGCA GGGTCCGGTA GG6TGTGCAG AGGCCACGTG GCCTAICCAC 1080

TGGGGAGGGT TCCriGATCT CTGGCCACCA GG6CTATCTC TGTGGCCTTT TGGAGCAACC 1140

TGGTGGTrTG 6GGCAGQG6T TGAATTTCCA GGCCTAAAAC CACACAGGCC TGGCC1T6AG 1200

TCCT6GCTCT GCGAfiTAATG CAT6GATGTA AACATSiAGA CCCAGGACCT TGCCTCAGTC 1260

TTCCGAGTCT GGTGCCTGCA GTGTACTGAT GGTGTGAGAC CCTACTCCTG 6AGGATGGGG 1320

GACA6AATCT GATC6ATCCC CTGGGnGGT GACTTCCCTG TGCAATCAAC GGAGACCAGC 1380

AAGGGHGGA 11111AATAA ACCACnAAC TCCTCCGAGT CTCAG7TTCC CCCTCTATGA 1440

AATGGGGTTG ACAGCATTAA 7AACTACCTC nGGGTGGTT GTGAGCCTTA ACTGAA3TCA 1500

TAATATCTCA TGTTTACTGA GCATGAGCTA TGTGCAAAGC CTGTnTGAG AGCTTTATGT 1560

GGACTAACTC CTTTAAnCT CACAACACCC ITTAAGGCAC AGATACACCA CGTTATrCCA 1620

TCCATTTTAC AAATGAGGAA ACTGAGGCAT CGA6CAGTTA AGCATCnGC CCAACAITGC 1680

CCTCCAGTAA GTGCTGGAGC TGGAATTT6C ACCGTGCAGT CTGGCITCAT GGCCTGXCT 1740

GTGAATCCTG TAAAAATTGT nGAAAGACA CCATGAGTGT CCAATCAACG TTAGCT.WTA 1800

TTCTCAGCCC AGTCATCAGA CCGGCAGAGG CAGCCACCCC ACTGTCCCCA GGGAGGm 1860

AAACATCCTG GCACCCTCTC CACTGCAITC TGGAGCTGCT TTCTAG6CAG GCAGTGrGAG 1920
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CTCAGCCCCA CGTAGAGCGG GCAGCCGAGG CCHCTGAGG CTATGTCTCT AGCGAACAAG 1980 

GACCCTCAAT TCCAGCTTCC GCCTGACGGC CAGCACACAG GGACAGCCCT TTCATTCCGC 2040 

TTCCACCTGG G6GTGCAGGC AGAGCAGCAG CGGGG6TAGC ACT6CCCGGA GaCAGAAGT 2100 

CCTCCTCAGA CAGGTGCCAG TGCCTCCAGA ATGTGGCAGC TCACAAGCCT CCTGCTG7TC 2160 

GTGGCCACCT GGGGAATTTC CGGCACACCA GCTCCTCTTG GTAAGGCCAC CCCACCCCTA 2220 

CCCCGGGACC CTTGTG6CCT CTACAAGGCC CTGGTGGCAT CTGCCCAGGC CTTCACAGCT 2280 

TCCACCATCT CTCTGAGCCC TGGGTGAGGT GAGGGGCAGA TGGGAATGGC AGGAATCAAC 2340 

TGACAAGTCC CAGGTAGGCC AGCTGCCAGA GTGCCACACA GGGGCTGCCA GGGCAGGCAT 2400 

GC6T6ATGGC AGGGAGCCCC GCGATGACCT CCTAAAGCTC CCTCCTCCAC ACGGGGATGG 2460 

TCACAGAGTC CCCTGGGCCT TCCCTCTCCA CCCACTCACT CCCTCAACTG TGAAGACCCC 2520 

AGGCCCAGGC TACCGTCCAC ACTATCCAGC acagcctccc ctactcaaat gcacactggc 2580

CTCATGGCTG CCCTGCCCCA ACCCCTTTCC TGGTCTCCAC AGCCAACGGG AGGAGGCCAT 2640 

GATTCTTGGG GAGGTCCGCA GGCACATGGG CCCCTAAAGC CACACCAGGC TGnGGTTTC 2700 

ATTTGTGCCT HATAGAGCT GTTTATCTGC nGGGACCTG CACCTCCACC CTTTCCCAAG 2760 

GTGCCCTCAG CTCAGGCATA CCCTCCTCTA GGATGCCTTT TCCCCCATCC CTTCTTGCTC 2820 
ACACCCCCAA CHGATCTCT CCCTCCTAAC TGTGCCCTGC ACCAAGACAG ACACTTCACA 2880

GAGCCCAGGA CACACCTGGG GACCCTTCCT GGGTGATAGG TCTGTCTATC CTCCAGGTGT 2940 

CCCT6CCCAA GGG6AGAAGC ATGGGGAATA CnGGTTGGG G6AGGAAAGG AA6ACTGGGG 3000 

6GAT6TGTCA AGATGGGGCT GCATGTGGTG TACTGGCAGA AGAGTGAGAG GATTTAACn 3060 

GGCAGCCm ACAGCAGCAG CCAGGGCHG AGTACHATC TCTGGGCCAG GCTGTATTGG 3120 

ATGHTTACA TGACGGTCTC ATCCCCATGT nrrGGATGA GTAAATTGAA CCTTAGAAAG 3180 

GTAAAGACAC TGGCTCAAGG TCACACAGAG ATCGGGGT6G GGHCACAGG GAG6CCTGTC 3240 
CAiaCAGAG CAAGGCnCG TCCTCCMCT GCCATCTGCT TCCTGGGGWi GAAA/iGAGCA 3300

GAGGACCCCT GCGCCAAGCC ATGACCTAGA ATTAGAATGA GTCHGAGGG GGCGCiAGACA 3360 

AGACCnCCC AG6CTCTCCC AGCTCTGCF CCTCAGACCC CCTCATGGCC CCAGCCCCTC 3420

TTAGGCCCCT CACCAAGGTG AGCTCCCCTC CCTCCAAAAC CAGACTCAGT GTTnCCAGC 3480 

AGCGAGCGTG CCCACCAGGT 6CTGCGGATC CGCAAACGT GCC AAC TCC HC CTG 3534 

GAG GAG CTC CGT CAC AGC A6C CTG SAG CGG GAG TGC ATA GAG GAG ATC 3582

TGT 6AC nc GAG GAG GCC AAG GAA AH TIC CAA AAT GIG CAT GAC ACA 3630 



GTAAGGCCAC CATGGGTCCA GAG6ATGAGG CTCAGGGGCG AGCTGGTAAC CAGCAGGGGC 3690

CTC6AGGAGC AGGTGGGGAC TCAATGCTGA GGCCaCTTA GGAGTTGIGG GGGTGGCTGA 3750

GTGGAGCGAT TAGGATGCTG GCCCTATGAT GTCGGCCAGG CACATGTGAC TGCAAGAAAC 3810

AGAAHCAGG AAGAAGCTCC AGGAAAGAG1 GTGGGGTGAC CCTAG6TGGG GACTCCCACA 3870

GCCACAGTGT AGGTGGnCA GTCCACCCTC CAGCCACTGC TGAGCACCAC TGCCTCCCCG 3930

TCCCACCTCA CAAAGAGGGG ACCTAAAGAC CACCCTGCn CCACCCATGC CTCTGCTGAT 3990

CAGGGTGTGT GTGTGACCGA AACTCACTTC TGTCCACATA AAATCGCTCA CTCTGrCCCT 4050

CACATCAAAG G6AGAAAATC TGAHCnCA GGGGGTCGGA AGACAGGGTC TGTGTXTAT 4110

nCTCTAAGG GTCAGAGTCC nTGGAGCCC CCAGAGTCCT GTGGACGTGG CCCTA36TAG 4170

TAGGGTGAGC HGGTAACGG GGCTGGcrrc CTGAGACAA6 GCTCAGACCC GCTCT3TCCC 4230

TGGGGATCGC nCAGCCACC AGGACCTGAA AAHCTGCAC GCCTGGGCCC ccttc;aagg 4290

CATCCAGGGA TGCTTTCCAG TGGAGGCTTT CAGGGCAGGA GACCCTCT6G CCTGC!\CCCT 4350

CTCTTGCCCT CA6CCTCCAC CTCCTTGACT G6ACCCCCAT CTG6ACCTCC ATCCCZACCA 4410

CCTCmCCC CAGTGGCCTC CCTGGCAGAC ACCACA6TGA CITTCTGCAG GCACATATCT 4470

GATCACATCA AGTCCCCACC GTGCTCCCAC CTCACCCATG GTCTCTCAGC cccag:agcc 4530

TTGGCTGGCC TCTCT6ATGG AGCAGGCATC AGGCACA6GC CGTGGGTCTC AACGTGGGCT 4590

GGGTGGTCCT GGACCAGCAG CAGCCGCCGC AGCAGCMCC CTGGTACCTG GTTAGGAACG 4650

CAGACCCTCT GCCCCCATCC TCCCAACTCT GAAAAACACT GGCHAGGGA AAGGCGCGAT 4710 

GCTCAGGGGT CCCCCAAAGC CCGCAGGCAG AGGGAGTGAT GGGACTGGAA 6GAGGCCGAG 4770 

TGACHGGTG AGGGAHCGG GTCCCHGCA TGCAGAGGCT GCTGTG6GAG CGGACAGTCG 4830 

CGAGAGCAGC ACTGCAGCTG CATGGGGAGA GGGTGTTGCT CCAGGGACGT GGGATGGA6G 4890 

CTGGGCGCGG 6CGGGTGGCG CTGGAGGGCG GGGGAGGGGC AG6GAGCACC AGCTCCTAGC 4950 

AGCCAACGAC CATCGGGCGT CGATCCCTGT TTGTCTGGAA GCCCTCCCCT CCCCTGCCCG 5010 

CTCACCCGCT GCCCTGCCCC ACCCGGGCGC GCCCCTCCGC ACACCGGCTG CAGGAGCCTG 5070 

ACGCTGCCC6 CTCTCTCCGC AG CTG GCC TTC TG6 TCC AAG CAC GTC G 5117 

GTGAGTGCGT TCTAGATCCC CGGCTGGACT ACCGGCGCCC GCGCCCCTCG GGATCTCTGG 5177 

CCGCT6ACCC CCTACCCCGC CnGTGTCGC AG AC GGT GAC CAG TGC TTG GTC 5229 

TTG CCC m GAG CAC CCG TGC GCC AGC CTG TGC TGC GGG CAC GGC ACG 5277 

TGC ATC GAC GGC ATC GGC AGC HC AGC TGC GAC TGC CGC AGC GGC TGG 5325 

GAG GGC CGC TTC TGC CAG CGC G GTGAGGGG6A GAGGTGGATG CTGGCG6GCG 5377 

GCGGGGCGGG GCTGGGGCCG GGTTGGGGGC GCGGCACCAG CACCAGCTGC CCGCGCCCTC 5437 

CCCTGCCCGC AG AG GTG AGC nc CTC AAT TGC TCT CTG GAC AAC GGC 5484

GGC TGC ACG CAT TAC TGC CTA GAG GAG GTG GGC TGG CGG CGC TGT AGC 5532 

TGT GCG CCT GGC TAC AAG CTG GGG GAC GAC CTC CTG CAG TGT CAC CCC 5580 

GCA G GTGA6AAGCC CCCAATACAT CGCCCA6GAA TCACGCTGGG TGCGGGGTGC 5634 

GCAGGCCCCT GACGGGCGCG GCGCGGGG6G CTCAGGAGGG THCTAGGGA GGGAGCGAGG 5694

AACAGAGHG AGCCHGGGG CAGCGGCAGA CGCGCCCAAC ACCGGGGCCA CTGTTAGCGC 5754 

AATCAGCCCG GGAGCTGGGC GCGCCCTCCG CTHCCCTGC HCCTTTCTT CCTGGCGTCC 5814 
CCGCnCCTC CGG6CGCCCC T6CGACCTGG GGCCACCTCC TGGAGCGCAA GCCCA3TGGT 5874

GGCTCCGCTC CCCAGTCTGA GCGTATCTG6 GGCGAGGC6T GCAGCGTCCT CCTCCATGTA 5934

GCCTGGCTGC GIIIIICTCT GACGHGTCC GGCGTGCATC GCAITTCCCT cttta:cccc 5994

nGCTTCCn GAGGAGAGAA CAGAATCCCG AHCTGCCn CTTCTATATT TTCCTmTA 6054

TGCATTTTAA TCAAATTTAT ATATGTATGA AACTTTAAAA ATCAGAGrrr TACAACTCTT 6114

ACACTnCy<G CATGCTGHC CTTGGCAT6G 6TCCIII 1 1 1 CAnCATTTT CATAA,\AGGT 6174

GGACCCTTTT AATGTGGAAA TTCCTATCn CTGCCTCTAG GGCAITTATC ACnATTTCT 6234

TCTACAATCT CCCCTTTACT TCCTCTATTT TCTCTTTCTG GACCTCCCAT TATTG«3ACC 6294

TCTTTCCTCT AGTTTTATTG TCTCTTCTAT nCCCATCTC TTTGACTTTG TGTTT'Cm 6354

CAGGGAACTT TCilllliil CM II mil GA6ATGGAGT nCACTCTTG hgtcccagg 6414

CTGGAGTGCA ATGACGTGAT CTCAGCTCAC CACAACCTCC GCCTCCTGGA T7CAAGCGAT 6474

TCTCCTGCCG CftGCCTCCCG AGTAGCTGGG ATTACAGGCA TGCGCCACCA CGCCa\GCTA 6534

AmTGTGTT TTTAGTAGAG AAGGGGmrC TCC6TGTTGG TCAAGCTGGt CHGW^CTCC 6594

TGACCTCAG6 TGATCCACCT GCCnGGCCT CCTAAAGTGC TGCGAHACA GGCGTGAGCC 6654

ACCGCGCCCA GCCTcrrrcA GGGAACTTTC TACAACTTTA TAAnCAATT CTTCTCiCAGA 6714

AAAAAATTTT T6GCCAGGCT CAGTAGCTCA GACCAATAAf TCCAGCACTT TGAGACfiCTG 6774

AGGTGGGAGG ATTGCnGAG CTTGGGAGTT TGAGACTAGC CTGGGCAACA CAGTG/fiACC 6834

CTGTCTCTAT TTTTAAAAAA AGTAAAAAAA CATCTAAAAA nTAAcrm TATTDGAAA 6894

TAATTAGATA TTTCCAGGAA GCTGCAAAGA AATGCCTGGT GGGCCTGHG GCTGTGfiGTT 6954

TCCTGCAAGG CC6TGGGAAG GCCCTGTCAT TGGCAGAACC CCAGATCGTG AGGGCITTCC 7014

TTTTAGGCTG CTTTCTAAGA GGACTCCTCC AAGCTCTTGG AGGATGGAAG ACGCTCACCC 7074

ATGGTGnCG GCCCCTCAGA GCAGGGTGGG GCAGGGGAGC TGGTGCCTGT GCAGGCTGTG 7134

GACATTTGCA TGACTCCCTG TGGTCA6CTA AGAGCACCAC TCCncCTGA AGCGGGGCCT 7194

GAAGTCCCTA GTCAGAGCCT CTGGFCACC TTCTGCAGGC A6GGAGAGGG 6A6TCAAGTC 7254

AGTGAGGAGG GCTTTCGCAG TTTCTCTTAC ^^AACTCTCAA CATGCCCTCC CACCTGCACT 7314 

GCCTTCCTGG AAGCCCCACA GCCTCCTATG GTTCCGTGGT CCAGTCCHC AGCHCTGGG 7374 

CGCCCCCATC ACGGGCTGAG ATnTTGCTT TCCAGTCTGC CAAGTCAGH ACTGTGTCCA 7434 

TCCATCTGCT GTCAGCTTCT GGAAnGTTG CTGTTGTGCC CTTTCCATTC TTrTGTTATG 7494 

AT6CAGCTCC CCTGCTGACG ACGTCCCATT GCTCTFTTAA GTCTAGATAT CTGGACTGGG 7554 

CATTCAAGGC CCATTITGAG CA6AGTCGGG CTGACCTTTC AGCCCTCAGT TCTCCATGGA 7614 

GTATGCGCTC TCHCnGGC AGGGAGGCCT CACAAACATG CCATGCCTAT TGTAGCAGCT 7674 

CTCCAAGAAT GCTCACCICC TTCTCCCTGT AA1TCCTTTC aCTGTGAGG AGCTCA6CAG 7734 

CATCCCAHA TGAGACCTTA CTAATCCCAG GGATCACCCC CAACAGCCCT GGG6TACAAT 7794 

GAGCTTTTAA GAAGTTTAAC CACCTATGTA AGGAGACACA GGCAGTGGGC GATGCTGCCT 7854 

GGCCTGACTC nGCCATTGG GTGGTACT6T HGHGACTG ACT6ACTGAC TGACTGGAGG 7914 

GGGTTTGTAA nTGTATCTC AGGGATTACC CCCAACAGCC CTGGGGTACA ATGAGCCHC 7974 

AAGAAGnTA ACAACCTATG TAAGGACACA CAGCCAGTGG GT6ATGCTGC CTGGTCTGAC 8034 

TCTTGCCATT CA6TG6CACT GlTTGnGAC TGACTGACTG ACTGACTGGC TGACTGGAGG 8094 

GGGHCATAG CTAATATTAA TGGAGTGGTC TAAGTATCAT TGGTTCCTIG AACCCTGCAC 8154

TGTGGCAAAG TGGCCCACAG GCTGGAGGAG GACCAAGACA GGAGGGCAGT CTC6GGAGGA 8214 

GTGCCTGGCA GGCCCCTCAC CACCTCTGCC TACCTCAG TG AAG HC CCT TGT 8266 

GGG AGG CCC TGG AAG CGG ATG GAG AAG AAG CGC AGT CAC CTG AAA CGA 8314 

GAC ACA GAA GAC CAA GAA GAC CM GTA GAT CCG CGG CTC ATT GAT GGG 8362 

AAG ATG ACC AGG CGG GGA GAC AGO CCC TGG CAG GTGGGAGGCG AGGCAGCACC 8415 

GGCTCGTCAC GTGCTGGGTC CGGGATCACT GAGTCCATCC TGGCA6CTAT GCTCAGGGTG 8475 
CAGAAACCGA GAGGGAAGCG CTGCCAHGC GriTGGGGGA TGATGAAGGT GGGGGATGCT 8535

TCAGGGAAAG ATGGACGCAA CCTGA6GGGA GAGGAGCAGC CAGGGTGGGT GAGGGGAGGG 8595 

GCATGGGGGC ATGGAGGGGT CTGCAGGAG6 GAGGGHACA GTTTCTAAAA AGAGCIGGAA 8655 

AGACACTGCT CTGCTGGC6G GATTTTAGGC AGAAGCCCTG CTGATGGGAG AGGGCTAGGA 8715 

GGGAGGGCCG GGCCTGAGTA CCCCTCCAGC CTCCACATG6 GAACTGACAC TTACTGGGn 8775 

CCCCTCTCTG CCAGGCATGG G6GAGATAGG AACCAACAAG TGGGAGTATT TGCCCTGGGG 8835 

ACTCAGACTC TGCAA6GGTC AGGACCCCAA AGACCCGGCA GCCCAGTGGG ACCACAGCCA 8895 

GGACGGCCCT TCAAGATAGG GGCTGAGGGA GGCCAAGGGG AACATCCAGG CAGCCTGGG6 8955 

GCCACAAAGT CHCCTGGAA GACACAAGGC CTGCCAACCC TCTAAGGATG AGAGGAGCTC 9015 

GCTGGGCGAT GnGGTGTGG CTGAGGGTGA CTGAAACAGT AT6AACAGTG CAGGAACAGC 9075 

ATGGGCAAAG GCAGGAAGAC ACCCTGGGAC AGGCTGACAC TGTAAAATGG GCAAAAATAfi 9135 

AAAACGCCAG AAAGGCCTAA GCCTATGCCC ATATGACCAG GGAACCCAGG AAAGTGCATA 9195 

TGAAACCCAC GTGCCCTGGA CTGGAGGCTG TCAGGAGGCA 6CCCTGT6AT GTCATCATCC 9255 

CACCCCATTC CAG GTG GTC CTG CTG GAC TCA AAG AAG AAG CTG GCC TGC 9304 

GGG GCA GTG CTC ATC CAC CCC TCC TGG GTG CTG ACA GCG GCC CAC T3C 9352 

ATG GAT GAG TCC AAG AAG CTC CTT GTC AGG CH 6 GTATGGGCTG 9396 

GAGCCAGGCA GAAGGGG6CT GCCAGAGGCC TGGGTAGGGG GACCAGGCAG GCTGTrZAGG 9456 

TTTGGGGGAC CCCGCTCCCC AGGTGCHAA GCAAGA6GCT TCTTGAGCTC CACAGA«5GT 9516 

GTTTGGGGGG AAGAGGCCTA TGTGCCCCCA CCCTGCCCAC CCATGTACAC CCAGTAHTT 9576 

GCAGTAGGGG GHCTCTGGT GCCCTCTTCG AATCT6GGCA CAGGTACCTG CACACACATG 9636 

rrTGTGAGGG GCTACACAGA CCTTCACCTC TCCACTCCCA CTCATGA6GA GCAGGCrGTG 9696 

TGGGCCTCAG CACCCTTGGG TGCAGAGACC AGCAAGGCCT GGCCTCAGGG CTGTGC;TCC 9756 

CACA6ACTGA CAGGGATGGA GCTGTACAGA GGGAGCCCTA GCATCTGCCA AAGCCAIAAG 9816 
CTGCnCCCT AGCAGGCTGG GGGCTCCTAT GCAnCGCCC CGATCTATGG CAATTTCTGG 9876
AGG6GGGGTC TGGCTCAACT CTTTATGCCA AAAAGAAGGC AAAGCATATT GAGAAAGGCC 9936 
AAATTCACAT TTCCTACAGC ATAATCTATG CCAGTGGCCC CGTGGGGCTT GGCTTA6AAT 9996 
TCCCAGGTGC TCTTCCCAGG GAACCATCAG TCTGGACTGA GAGGACCTTC TCTCTCAGGT 10056 

GGGACCCGGC CCTGTCCTCC CTGGCAGTGC CGTGTTCTGG GGGTCCTCCT CTCTGGGTCT 10116 

CACTGCCCCT GGGGTCTCTC CAGCTACCTT TGCTCCAT6T TCCTTTGTGG CTCTGGTCTG 10176 

TGTCTGGGGT HCCAGGGGT CTCGGGCHC CCTGCTGCCC AnCCTTCTC TGGTCTCACG 10236 

GCTCCGTGAC TCCTGAAAAC CAACCAGCAT CCTACCCCTT TGGAHGACA CCTGTTGGCC 10296 

ACTCCncTG GCAGGAAAAG TCACCGTTGA TAGG6TTCCA C6GCATA6AC AGGTGGCTCC 10356 

GCGCCAGTGC CTGGGACGTG TGG6TGCACA GTCTCCGGGT GAACCnCTT CAGGCCCTCT 10416 

CCCAGGCCT6 CAGGGGCACA GCAGTGGGTG GGCCTCAGGA AAGTGCCACT GGGGAGAGGC 10476 

TCCCCGCAGC CCACTCTGAC TGTGCCCTCT GCCCTGCAG GA GAG TAT GAC CTG 10529 

CGG CGC TGG CAG AAG TGG GAG CTG GAC CTG GAC ATC AAG GAG GTC TTC 10577 

GTC GAC CCC AAC TAC AGC AAG AGC ACC ACC GAC AAT GAC ATC GCA CTG 10625 

CTG CAC CTG GCC CAG CCC GCC ACC CTC TCG CAG ACC ATA GTG CCC ATC 10673 

TGC CTC CCG GAC AGC GGC CTT GCA GAG CGC GAG CTC AAT CAG GCC GGC 10721

CAG GAG ACC CTC GTG ACG GGC TGG GGC TAC CAC AGC AGC CGA GAG AAG 10769 

GAG GCC AAG AGA AAC CGC ACC HC GTC CTC AAC TTC ATC AAG ATT CCC 10817 

GTG GTC CCG CAC AAT GAG TGC AGC GAG GTC ATG AGC AAC ATG GTG TCT 10865 

GAG AAC ATG CTG T6T GCG GGC ATC CTC GGG GAC CGG CAG GAT GCC TGC 10913 

GAG GGC GAC AGT GGG GGG CCC ATG GTC GCC TCC TTC CAC GGC ACC TGG 10961 

nc CTG GTG GGC CTG GTG AGC TGG GGT GAG GGC TGT GGG CTC CTT CAC 11009 
AAC TAC GGC GH TAC ACC AAA GTC AGC CGC TAC CTC GAC TGG ATC CAT 11057

GGG CAC ATC AGA GAC AAG GAA GCC CCC CAG AAG AGC TGG GCA CCT 11102 

TAGCGACCCT CCCTGCAGGG CIGGGCTTTT GCATGGCAAT GGATGGGACA TTAAA3GGAC 11162 

ATGTAACAAG CACACCGGCC TGCTGTTCTG TCCTTCCATC CCTCmTGG GCTCTfCTGG 11222 

AGGGAAGTAA CATfTACTGA GCACCTGTTG TATGTCACAT GCCHATGAA TAGAAFCTTA 11282 

ACTCCTAGAG CAACTCTGTG GGGTGG6GAG GAGCAGATCC AAGTTTTGCG GGGTCTAAAG 11342 

CTGTGTGTGT TGAGGGGGAT ACTCTGTTTA TGAAAAA6AA TAAAAAACAC AACCACGAAG 11402 

CCACTAGAGC CTTTTCCAGG GCTTTGGGAA GAGCCTGTGC AAGCCGGGGA TGCTG.\AGGT 11462 

GAGGCnGAC CAGCTTTCCA GCTAGCCCAG CTATGAGGTA GACATGTTTA GCTCATATCA 11522 

CAGAGGAGGA AACTGAGGGG TCTGAAAGGT nACATGGTG GAGCCAGGAT TCAAA'XTAG 11582 

GTCTGACTCC AAAACCCAGG TGCTTTmC TGHCTCCAC TGTCCTGGAG GACAGCTGTT 11642 

TCGACGGTGC TCAGTGTGGA GGCCACTATT AGCTCTGTAG GGAAGCAGCC AGAGACCCAG 11702 

AAAGTGTT6G TTCAGCCCAG AAT 11725 

(2) INFORMATION FOR SEQ ID NO:2:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 460 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:2:

Met Trp Gln Leu Thr Ser Leu Leu Leu Phe Val Ala Thr Trp Gly Ile
1 5 10 15

Ser Gly Thr Pro Ala Pro Leu Asp Ser Val Phe Ser Ser Ser Glu Arg
20 25 30

Ala His Gln Val Leu Arg Ile Arg Lys Arg Ala Asn Ser Phe Leu Glu
35 40 45

Glu Leu Arg His Ser Ser Leu Glu Arg Glu Cys Ile Glu Glu Ile Cys
50 55 60

Asp Phe Glu Glu Ala Lys Glu Ile Phe Gln Asn Val Asp Asp Thr Leu
65 70 75 80

Ala Phe Trp Ser Lys His Val Asp Gly Asp Gln Cys Leu Val Leu Pro
85 90 95

Leu Glu His Pro Cys Ala Ser Leu Cys Cys Gly His Gly Thr Cys Ile
100 105 110

Asp Gly Ile Gly Ser Phe Ser Cys Asp Cys Arg Ser Gly Trp Glu Gly
115 120 125

Arg Phe Cys Gln Arg Glu Val Ser Phe Leu Asn Cys Ser Leu Asp Asn
130 135 140

Gly Gly Cys Thr His Tyr Cys Leu Glu Glu Val Gly Trp Arg Arg Cys
145 150 155 160

Ser Cys Ala Pro Gly Tyr Lys Leu Gly Asp Asp Leu Leu Gln Cys His
165 170 175

Pro Ala Val Lys Phe Pro Cys Gly Arg Pro Trp Lys Arg Met Glu Lys
180 185 190

Lys Arg Ser His Leu Lys Arg Asp Thr Glu Asp Gln Glu Asp Gln Val
195 200 205

Asp Pro Arg Leu Ile Asp Gly Lys Met Thr Arg Arg Gly Asp Ser Pro
210 215 220

Trp Gln Val Val Leu Leu Asp Ser Lys Lys Lys Leu Ala Cys Gly Ala
225 230 235 240

Val Leu Ile His Pro Ser Trp Val Leu Thr Ala Ala His Cys Met Asp
245 250 255

Glu Ser Lys Lys Leu Leu Val Arg Leu Gly Glu Tyr Asp Leu Arg Arg
260 265 270

Trp Glu Lys Trp Glu Leu Asp Leu Asp Ile Lys Glu Val Phe Val His
275 280 285

Pro Asn Tyr Ser Lys Ser Thr Thr Asp Asn Asp Ile Ala Leu Leu His
290 295 300

Leu Ala Gln Pro Ala Thr Leu Ser Gln Thr Ile Val Pro Ile Cys Leu
305 310 315 320

Pro Asp Ser Gly Leu Ala Glu Arg Glu Leu Asn Gln Ala Gly Gln Glu
325 330 335

Thr Leu Val Thr Gly Trp Gly Tyr His Ser Ser Arg Glu Lys Glu Ala
340 345 350

Lys Arg Asn Arg Thr Phe Val Leu Asn Phe Ile Lys Ile Pro Val Val
355 360 365

Pro His Asn Glu Cys Ser Glu Val Met Ser Asn Met Val Ser Glu Asn
370 375 380

Met Leu Cys Ala Gly Ile Leu Gly Asp Arg Gln Asp Ala Cys Glu Gly
385 390 395 400

Asp Ser Gly Gly Pro Met Val Ala Ser Phe His Gly Thr Trp Phe Leu
405 410 415

Val Gly Leu Val Ser Trp Gly Glu Gly Cys Gly Leu Leu His Asn Tyr
420 425 430

Gly Val Tyr Thr Lys Val Ser Arg Tyr Leu Asp Trp Ile His Gly His
435 440 445

Ile Arg Asp Lys Glu Ala Pro Gln Lys Ser Trp Ala
450 455 460



(2) INFORMATION FOR SEQ ID NO:3:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1386 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(ix) FEATURE:

(A) NAME/KEY: CDS
(B) LOCATION: 1.. 138(1

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:3:

ATG TGG CAG CTC ACA AGC CTC CTG CTG TTC GTG GCC ACC TGG GGA ATT 48
Met Trp Gln Leu Thr Ser Leu Leu Leu Phe Val Ala Thr Trp Gly Ile
1 5 10 15

TCC GGC ACA CCA GCT CCT CTT GAC TCA GTG TTC TCC AGC AGC GAG CGT 96
Ser Gly Thr Pro Ala Pro Leu Asp Ser Val Phe Ser Ser Ser Glu Arg
20 25 30

GCC CAC CAG GTG CTG CGG ATC CGC AAA CGT GCC AAC TCC TTC CTG GAG 144
Ala His Gln Val Leu Arg Ile Arg Lys Arg Ala Asn Ser Phe Leu Glu
35 40 45

GAG CTC CGT CAC AGC AGC CTG GAG CGG GAG TGC ATA GAG GAG ATC TGT 192
Glu Leu Arg His Ser Ser Leu Glu Arg Glu Cys Ile Glu Glu Ile Cys
50 55 60

GAC TTC GAG GAG GCC AAG GAA ATT TTC CAA AAT GTG GAT GAC ACA CTG 240
Asp Phe Glu Glu Ala Lys Glu Ile Phe Gln Asn Val Asp Asp Thr Leu
65 70 75 80

GCC TTC TGG TCC AAG CAC GTC GAC GGT GAC CAG TGC TTG GTC TTG CCC 288
Ala Phe Trp Ser Lys His Val Asp Gly Asp Gln Cys Leu Val Leu Pro
85 90 95

TTG GAG CAC CCG TGC GCC AGC CTG TGC TGC GGG CAC GGC ACG TGC ATC 336
Leu Glu His Pro Cys Ala Ser Leu Cys Cys Gly His Gly Thr Cys Ile
100 105 110

GAC GGC ATC GGC AGC TTC AGC TGC GAC TGC CGC AGC GGC TGG GAG GGC 384
Asp Gly Ile Gly Ser Phe Ser Cys Asp Cys Arg Ser Gly Trp Glu Gly
115 120 125

CGC TTC TGC CAG CGC GAG GTG AGC TTC CTC AAT TGC TCT CTG GAC AAC 432
Arg Phe Cys Gln Arg Glu Val Ser Phe Leu Asn Cys Ser Leu Asp Asn
130 135 140

GGC GGC TGC ACG CAT TAC TGC CTA GAG GAG GTG GGC TGG CGG CGC TGT 480
Gly Gly Cys Thr His Tyr Cys Leu Glu Glu Val Gly Trp Arg Arg Cys
145 150 155 160

AGC TGT GCG CCT GGC TAC AAG CTG GGG GAC GAC CTC CTG CAG TGT CAC 528
Ser Cys Ala Pro Gly Tyr Lys Leu Gly Asp Asp Leu Leu Gln Cys His
165 170 175

CCC GCA GTG AAG TTC CCT TGT GGG AGG CCC TGG AAG CGG ATG GAG AAG 576
Pro Ala Val Lys Phe Pro Cys Gly Arg Pro Trp Lys Arg Met Glu Lys
180 185 190

AAG CGC AGT CAC CTG AAA CGA GAC ACA GAA GAC CAA GAA GAC CAA GTA 624
Lys Arg Ser His Leu Lys Arg Asp Thr Glu Asp Gln Glu Asp Gln Val
195 200 205

GAT CCG CGG CTC ATT GAT GGG AAG ATG ACC AGG CGG GGA GAC AGC CCC 672
Asp Pro Arg Leu Ile Asp Gly Lys Met Thr Arg Arg Gly Asp Ser Pro
210 215 220

TGG CAG GTG GTC CTG CTG GAC TCA AAG AAG AAG CTG GCC TGC GGG GCA 720
Trp Gln Val Val Leu Leu Asp Ser Lys Lys Lys Leu Ala Cys Gly Ala
225 230 235 240

GTG CTC ATC CAC CCC TCC TGG GTG CTG ACA GCG GCC CAC TGC ATG GAT 768
Val Leu Ile His Pro Ser Trp Val Leu Thr Ala Ala His Cys Met Asp
245 250 255

GAG TCC AAG AAG CTC CTT GTC AGG CTT GGA GAG TAT GAC CTG CGG CGC 816
Glu Ser Lys Lys Leu Leu Val Arg Leu Gly Glu Tyr Asp Leu Arg Arg
260 265 270

TGG GAG AAG TGG GAG CTG GAC CTG GAC ATC AAG GAG GTC TTC GTC CAC 864
Trp Glu Lys Trp Glu Leu Asp Leu Asp Ile Lys Glu Val Phe Val His
275 280 285

CCC AAC TAC AGC AAG AGC ACC ACC GAC AAT GAC ATC GCA CTG CTG CAC 912
Pro Asn Tyr Ser Lys Ser Thr Thr Asp Asn Asp Ile Ala Leu Leu His
290 295 300

CTG GCC CAG CCC GCC ACC CTC TCG CAG ACC ATA GTG CCC ATC TGC CTC 960
Leu Ala Gln Pro Ala Thr Leu Ser Gln Thr Ile Val Pro Ile Cys Leu
305 310 315 320

CCG GAC AGC GGC CTT GCA GAG CGC GAG CTC AAT CAG GCC GGC CAG GAG 1008
Pro Asp Ser Gly Leu Ala Glu Arg Glu Leu Asn Gln Ala Gly Gln Glu
325 330 335

ACC CTC GTG ACG GGC TGG GGC TAC CAC AGC AGC CGA GAG AAG GAG GCC 1056
Thr Leu Val Thr Gly Trp Gly Tyr His Ser Ser Arg Glu Lys Glu Ala
340 345 350

AAG AGA AAC CGC ACC TTC GTC CTC AAC TTC ATC AAG ATT CCC GTG GTC 1104
Lys Arg Asn Arg Thr Phe Val Leu Asn Phe Ile Lys Ile Pro Val Val
355 360 365

CCG CAC AAT GAG TGC AGC GAG GTC ATG AGC AAC ATG GTG TCT GAG AAC 1152
Pro His Asn Glu Cys Ser Glu Val Met Ser Asn Met Val Ser Glu Asn
370 375 380

ATG CTG TGT GCG GGC ATC CTC GGG GAC CGG CAG GAT GCC TGC GAG GGC 1200
Met Leu Cys Ala Gly Ile Leu Gly Asp Arg Gln Asp Ala Cys Glu Gly
385 390 395 400

GAC AGT GGG GGG CCC ATG GTC GCC TCC TTC CAC GGC ACC TGG TTC CTG 1248
Asp Ser Gly Gly Pro Met Val Ala Ser Phe His Gly Thr Trp Phe Leu
405 410 415

GTG GGC CTG GTG AGC TGG GGT GAG GGC TGT GGG CTC CTT CAC AAC TAC 1296
Val Gly Leu Val Ser Trp Gly Glu Gly Cys Gly Leu Leu His Asn Tyr
420 425 430

GGC GTT TAC ACC AAA GTC AGC CGC TAC CTC GAC TGG ATC CAT GGG CAC 1344
Gly Val Tyr Thr Lys Val Ser Arg Tyr Leu Asp Trp Ile His Gly His
435 440 445

ATC AGA GAC AAG GAA GCC CCC CAG AAG AGC TGG GCA CCTTAG 1386
Ile Arg Asp Lys Glu Ala Pro Gln Lys Ser Trp Ala
450 455 460
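
The pairing of nucleotide triplets with three-letter residues in SEQ ID NO:3 can be checked mechanically. The following is a minimal sketch, assuming the standard genetic code; the function name `translate` and the variable names are illustrative and are not part of the listing. It translates the codons for residues 17 to 32 as they appear in the listing above.

```python
from itertools import product

# Standard genetic code, enumerated with bases in T, C, A, G order
# (third base varies fastest). "*" marks a stop codon.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# One-letter to three-letter residue names, as used in the listing.
THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
    "*": "Ter",
}


def translate(cds: str) -> list[str]:
    """Translate a coding sequence (whitespace ignored) codon by codon."""
    seq = "".join(cds.split()).upper()
    if len(seq) % 3 != 0:
        raise ValueError("sequence length must be a multiple of three")
    # A KeyError here means a character other than A, C, G or T is present.
    return [THREE_LETTER[CODON_TABLE[seq[i:i + 3]]] for i in range(0, len(seq), 3)]


# Codons for residues 17-32 of SEQ ID NO:3 (nucleotides 49-96).
codons_17_32 = "TCC GGC ACA CCA GCT CCT CTT GAC TCA GTG TTC TCC AGC AGC GAG CGT"
print(" ".join(translate(codons_17_32)))
```

Because the lookup fails on any character other than A, C, G or T, the same routine also flags triplets that were damaged in reproduction rather than silently mistranslating them.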



(2) INFORMATION FOR SEQ ID NO:4:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 460 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:4:

Met Trp Gln Leu Thr Ser Leu Leu Leu Phe Val Ala Thr Trp Gly Ile
1 5 10 15

Ser Gly Thr Pro Ala Pro Leu Asp Ser Val Phe Ser Ser Ser Glu Arg
20 25 30

Ala His Gln Val Leu Arg Ile Arg Lys Arg Ala Asn Ser Phe Leu Glu
35 40 45

Glu Leu Arg His Ser Ser Leu Glu Arg Glu Cys Ile Glu Glu Ile Cys
50 55 60

Asp Phe Glu Glu Ala Lys Glu Ile Phe Gln Asn Val Asp Asp Thr Leu
65 70 75 80

Ala Phe Trp Ser Lys His Val Asp Gly Asp Gln Cys Leu Val Leu Pro
85 90 95

Leu Glu His Pro Cys Ala Ser Leu Cys Cys Gly His Gly Thr Cys Ile
100 105 110

Asp Gly Ile Gly Ser Phe Ser Cys Asp Cys Arg Ser Gly Trp Glu Gly
115 120 125

Arg Phe Cys Gln Arg Glu Val Ser Phe Leu Asn Cys Ser Leu Asp Asn
130 135 140

Gly Gly Cys Thr His Tyr Cys Leu Glu Glu Val Gly Trp Arg Arg Cys
145 150 155 160

Ser Cys Ala Pro Gly Tyr Lys Leu Gly Asp Asp Leu Leu Gln Cys His
165 170 175

Pro Ala Val Lys Phe Pro Cys Gly Arg Pro Trp Lys Arg Met Glu Lys
180 185 190

Lys Arg Ser His Leu Lys Arg Asp Thr Glu Asp Gln Glu Asp Gln Val
195 200 205

Asp Pro Arg Leu Ile Asp Gly Lys Met Thr Arg Arg Gly Asp Ser Pro
210 215 220

Trp Gln Val Val Leu Leu Asp Ser Lys Lys Lys Leu Ala Cys Gly Ala
225 230 235 240

Val Leu Ile His Pro Ser Trp Val Leu Thr Ala Ala His Cys Met Asp
245 250 255

Glu Ser Lys Lys Leu Leu Val Arg Leu Gly Glu Tyr Asp Leu Arg Arg
260 265 270

Trp Glu Lys Trp Glu Leu Asp Leu Asp Ile Lys Glu Val Phe Val His
275 280 285

Pro Asn Tyr Ser Lys Ser Thr Thr Asp Asn Asp Ile Ala Leu Leu His
290 295 300

Leu Ala Gln Pro Ala Thr Leu Ser Gln Thr Ile Val Pro Ile Cys Leu
305 310 315 320

Pro Asp Ser Gly Leu Ala Glu Arg Glu Leu Asn Gln Ala Gly Gln Glu
325 330 335

Thr Leu Val Thr Gly Trp Gly Tyr His Ser Ser Arg Glu Lys Glu Ala
340 345 350

Lys Arg Asn Arg Thr Phe Val Leu Asn Phe Ile Lys Ile Pro Val Val
355 360 365

Pro His Asn Glu Cys Ser Glu Val Met Ser Asn Met Val Ser Glu Asn
370 375 380

Met Leu Cys Ala Gly Ile Leu Gly Asp Arg Gln Asp Ala Cys Glu Gly
385 390 395 400

Asp Ser Gly Gly Pro Met Val Ala Ser Phe His Gly Thr Trp Phe Leu
405 410 415

Val Gly Leu Val Ser Trp Gly Glu Gly Cys Gly Leu Leu His Asn Tyr
420 425 430

Gly Val Tyr Thr Lys Val Ser Arg Tyr Leu Asp Trp Ile His Gly His
435 440 445

Ile Arg Asp Lys Glu Ala Pro Gln Lys Ser Trp Ala
450 455 460

(2) INFORMATION FOR SEQ ID NO:5:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 10807 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:5:

ACGCGT6TCG ACCTGCAGGT CAACGGATCT CTGT6TCTGT TTTCATGHA GTACCACACT 60 

GTmrGGIGG CTGTAGCTTT CAGCTACAGT CTGAAGTCAT AAAGCCTGGT ACCTCCAGCT 120 

CTCnCTCTC TCAA6ATTGT GHCTGCTGI TTGGGTCTTT AGTGTCTCCA CACA/^TTTTT 180 

AGAATTGnr GTTCTAGHC TGTGAAA.AAT GATGCTG6TA T1TT6ATAAG GAHGCAHG 240 

AATCTGTAAA GCTACAGATA TAGTCATTGG GTAGTACAGT CACTTTAACA ATATTAACTC 300 

TTCACATCTG TGAGCATGAT ATATTnCCC CCTCTATATC ATCTTCAAn CCTCCTATCA 360 

GTTTCTTTCA TTGCAGTTTT CTGAGTACAG GTCTTACACC TCCnGGTTA GAGTCATTCC 420 

TCAGTATTTT AHCCTTTGA TACAAHGTG AAT6AGGTAA TTTTCTTAGT TraCTTTCT 480 

GATAGaCAT TGHAGTGTA TATATAGAAA AGCAACAGAT nCTATGTAT TAAnTTGTA 540 

TCCTGCAACA 6ATTTCTATG TAnAATDT GTATCCTGCT ACTTTACGGA AHC^mAT 600 

TAGCTTTTTG GTGACATCTT GAGGATTnC TGAAGAAAAT GGCATGGTAT GGTACfiACAA 660 

GGTGTCATGT CATCTGCAAA CAGTGGCAGT TTTCCncn CCCTTCCAAC CTGG^.TTrCT 720 

TTGATTTCTT TCTGTCTGAG TACGACTAGG AHCCCAATA CTATACCGAA TAAWGTGGC 780 

AA6AGTG6AC ATCCTTGTCT TATTTnCTG ACCTTA6AGG AAATGCTTTC AGTTITTCAC 840 

CAHAATTAT AATGTTTACT GTGGGCl Hi I CATAT6TGGC CTTCATTATA TGGAGGTCTA 900 

nCCCTCTAT ACCCACCTT6 TTGAGAGTT7 ITATCATAAA AGTATGTT6A ATTTIGTCAA 960 

AAGnrrrcc tgcatctatt gagatgatti ttactcttca attcattaat cAmTTAn 1020 
CTTCATnTG nMTGATTT CCAnCHCA ATTTGnAAC GTGGTATATC ACATTGATTG 1080

ATTTGTGGAT ACCTTT6TAT CCCTGGGATA AACCTCACTT GATCATGAGC niCAATGTA 1140 

TTTTTGAATT CACITTGCTA ATATTC'371 GGGTATTTTT GCATCTCTAT TCATCAATGA 1200 

TATTGGCCTA AGAAAGGTTT TGTCT&-;TTT TA6TATCAGG GT6ATGCTGG CCTCATAGAG 1260 

AGAGTTTAGA AGCATTTCCT CCTCTTTGAT TITTCGGAAT AGTTTGAGTA GGATAGGTAT 1320 

TAACTCnCT nAAATGHT GGGGACHCC CTGGTGAGCC GGTGGTTGAG AATCCGCCTC 1380 

AGGGATGTGG GTTTGATCCC TGGTCAGGGA ACCAHAATA AGATCCCACA T6CTGCAGGC 1440 

AACAAGCCCC CAAGCTGCAA CCACTGAGCT GCAACCGCTG CAGTGCCCAC AGGCCACGAC 1500 

CAGAGAAA6C CCACATACAG CAGGGAAGAC CCAGCACAAC CGGAAAAAGG AGTTTGGTGG 1560 

AATACAGCTG TGAAGCCGTC TGGTCCTGCA CTCCTGCHG AGGGAATTTT TTAAAAATTA 1620 

TTGATTCAAT TTCAnACTG GTAACTGGTC TGnCATATT nCTATTTCT TCCGGGnCA 1680 

GTCnCGGAG AHGTACATG CCTAGGAAT6 TGTCCG1TTC nCTAGGnG TCCATTTTAT 1740 

TG6ACATGCA TGGGAGCACA CAGCACCGA.C CAGCGAGACT CATGCT6GCT TCCTG6GGCC 1800 

AG6CTGGGGC CCCAAGCAGC ATGGCATCCl AGAGTGTGTG AAAGCCCACT GACCCTGCCC 1860 

AGCCCCACAA TTTCAnCTG AGAAGTCAH CCnGCHCT GCACTTACAG GCCCAGGATC 1920 

TGACCTGCTT CTGAGGAGCA 6GGGTTTTGG CAGGACGGGG AGATGCTGAG AGCCGACGG6 1980

GGTCCAGGTC CCCTCCCAGG CCCCCCTGTC TGGGGCAGCC CTTGGGAAAG AnCCCCCAG 2040 

TCTCCCTCCT ACAGTGGTCA GTCCCAGCT6 CCCCAGGCCA GAGCTGCTTT ATTTCCGTCT 2100 

CTCTCTCTGG ATGGTAnCT CTGGAAGCTG AAGGTTCCTG AAGHATGAA TAGCTTTGCC 2160 

CTGAAGGGCA TGGTTTGTGG TCACGGnCA CAGGAACHG GGAGACCCTG CAGCTCAGAC 2220 

GTCCCGAGAT TGGTGGCACC CAGATTTCCT AAGCTCGCTG GGGAACAGGG CGCTTGTTrC 2280 

TCCCTGGCTG ACCTCCCTCC TCCCTGCATC ACCCAGHCT GAAAGCAGAG CGGTGCTGG6 2340 
GTCACAGCCT CTCGCATCTA ACGCCGGTGT CCAAACCACC CGT6CTGGT6 HCGGGGGGC 2400

TACCTATGGG GAAGGGCHC TCACTGCAGT GGT6CCCCCC GTCCCCTCT6 AGATCAGAAG 2460 

TCCCAGTCCG GACGTCAAAC AGGCCGAGCT CCCTCCAGAG GCTCCAGGGA GGGATCCHG 2520 

CCCCCCCGCT GCTGCCTCCA GCTCCTGGTG CCGCACCCTT GAGCCTGATC TTGTAGACGC 2580 

CTCAGTCTAG TCTCTGCCTC CGTGTTCACA C6CCTTCTCC CCATGTCCCC TCC6T3TCCC 2640 

CGTTrrCTCT CACAAG6ACA CCGGACATTA GAITAGCCCC TGTTCCAGCC TCACCFGAAC 2700

AGCTCACATC TGTAAAGACC TAGAHCCAA ACAAGAHCC AACCTGAAGT TCCCG3TGGA 2760 

TGTGAGTTCT GGGGCGACAT CCHCAACCC CATCACAGCT. TGCAGTTCAT CGCAAWIAT 2820

GGAACCTG6G GTriATCGTA AAACCCAGGT TOTCATGAA ACACTGAGCT TCGAfiSCHG 2880 

HGCAAGAAT TAAAGGTGCT AATACAGATC AGGGCAAG6A aCAAGCTGG CTAAG-XTCC 2940 

TCTTTCCATC ACAGGAAAGG G6GGCCTGGG GGCGGCTGGA GGTCTGCTCC CGTGAi5TGAG 3000 

CTCTTTCCTG CTACAGTCAC CAACAGTCTC TCTGGGAAGG AAACCAGAGG CCAGAGAGCA 3060 

AGCCGGAGCT AG7TTAGGAG ACCCCTGAAC CTCCACCCAA GAT6CTGACC AGCCA(5CGGG 3120 

CCCCCTGGAA AGACCCTACA GHCAGGGGG GAAGAGGGGC TGACCCGCCA GGTCO.TGCT 3180 

ATCAGGAGAC ATCCCCGCTA TCAGGAGATT CCCCCACCn GCTCCCGnC CCCTATCCCA 3240 

ATACGCCCAC CCCACCCCTC TGATGAGCAG TTTAGTCACT TA6AATGTCA ACTGAilGGCT 3300 

T7TGCATCCC CTTTGCCA6A GGCACAAGGC ACCCACAGCC TGCTGGGTAC CGACGCCCAT 3360 

GTGGATTCAG CCAGGAGGCC 7GTCCTGCAC CCTCCCTGCl CGGGCCCCCT aCTGCTCAG 3420 

CAACACACCC AGCACCAGCA TTCCCGCTGC TCCTGAGGTC T6CAGGCAGC TCGCHJTAGC 3480 

CTGAGCGGTG TGGAGGGAAG TGTCCTGGGA GATTTAAAAT GTGAGAGGCG GGAGG"GGGA 3540 

GGnGGGCCC TGTGGGCCTG CCCATCCCAC GTGCCTGCAT TAGCCCCAGT GCTGC'CAGC 3600 

CGTGCCCCCG CCGCAGGGGT CAGGTCACH TCCCGTCCTG GGGTTATTAT GACTCTGTC 3660 

AHGCCATTG CCATTTTTGC TACCCTAACT GGGCAGCAGG TGCITGCAGA GCCCTCGATA 3720 
CCGACCAGGT CCTCCCTCGG AGCTCGACCT G/iACCCCATG TCACCCHGC CCCAGCCTGC 3780

AGAGG6TGGG TGACTGCAGA GATCCCnCA CCCAAGGCCA CGGTCACATG GTTTGGAGGA 3840 

GCTGGTGCCC AAGGCAGAGG CCACCCTCCA GGACACACCT GTCCCCAGTG CTGGCTCTGA 3900 

CCTGTCCnG TCTAAGAGGC TGACCCCGGA AGIGHCCTG GCACTGGCAG CCAGCCTGGA 3960 

CCCAGAGTCC AGACACCCAC CTGTGCCCCC GCTTCTGGGG TCTACCAGGA ACCGTCTAGG 4020

CCCAGAGG6G ACTTCCTGCT TGGCCTTGGA TG6AAGAAGG CCTCCTATTG TCCTCGTAGA 4080 

GGAAGCCACC CCGGGGCCTG AGGATGAGCC AAGTGGGATT CCGGGAACC6 CGTGGCTGGG 4140 

GGCCCAfiCCC GGGCTGGCTG GCCTGCATGC CTCCTGTATA AGGCCCCAAG CCTGCTGTCT 4200 

CAGCCCTCCA CTCCCTGCAG AGCTCAGAAG CACGACCCCA GGGATATCCC TGCAGCCATG 4260 

AAGTGCCTCC TGCTTGCCCT GGGCCT6GCC CTCGCCT6TG GCGTCCAGGC CATCATCGTC 4320 

ACCCAGACCA TGAAAGGCCT GGACATCCAG AAGGHCGAG GGTTGGCCGG GTGGGTGAGT 4380 

TGCAGGGCGG GCAGGGGAGC TGGGCCTCAG AGAGCCAAGA GAGGCTGT6A CGnGGGTTC 4440 

CCATCAGTCA GCTAGGGCCA CCTGACAAAT CCCCGCTGGG GCAGCTTCAA CCAGGCGTTC 4500 

ACTGTCTTGC AHCTGGAGG CTGGAAGCCC AAGATCCAGG TGTTGGCAGG GCTGGCnCT 4560 

CCTGCGGCC6 CTCTCTGGGG AGCAGACGGC C6TCTTCTCC AGTCCTCTGC GCGCCCTGAT 4620 

TTCCTCncC TGTGAG6CCA CCAGGCCTGC TGGAAACACG CCTGCCTGCG CAGCHCACA 4680

CGACCTTTGT CATCTCTTTA AAGGCCAT6T CTCCAGAGTC ATGT61TGAA GTTCTGGG6G 4740 

TTAGTGGGAC ACAGHCAGC CCCTAAAAGA GTCTCTCTGC CCCTCAAATT TTCCCCACCT 4800 

CCAGCCATGT CTCCCCAAGA TCCAAATGH GCTACATGTG GGG6GGCTCA TCTGGGTCCC 4860 

TCTTTGGGn CA6TGTGAGT CTGGGGAGAG CATTCCCCAG GGTGCAGAGT TGGGGG6AGT 4920 

ATCTCAG6GC TGCCCAGGCC 66GGTGGGAC AGAGAGCCCA CTGTGG6GCT GGGGGCCCCT 4980 

TCCCACCCCC AGAGTGCAAC TCAAGGTCCC KTCCAGGTG GCGGGGACH GGCACTCCH 5040 
GGCTArCGCG GCCAGCGACA TCTCCCTGCT GGATGCCCAG AGTGCCCCCC TGAG/^GHGTA 5100

CGTGGAGGAG CTGAAGCCCA CCCCCGAGGG CA/.CCTGGAG ATCCTGCTGC AGAA/.TGGTG 5160 

GGCGTCTCTC CCCAACATG6 AACCCCCACT CCCCAGGGCT GTGGACCCCC CGGGGCGTG6 5220 

GGTGCAGGAG GGACCASGGC CCCAGGGCTG GGGAAGAGGG CTCAGAGTTT ACTGGTACCC 5280 

GGCGCTCCAC CCAAGGCTGC CCACCCAGGG CiilllillT ITTTAAACTT mnAATTT 5340 

6ATGCTTCAG AACATCATCA AACAAATGAA CATAAAACAT TCATnTTGT HACTTCGAA 5400 

6GGGAGATAA AATCCTCTGA AGTGGAAATG CATAGCAAAG ATACATACAA TGAGGCAGGT 5460 

AirCTGA^n CCCTGTTAGT CTGAGGATTA CAAGTCTAH TGAGCAACAG AGAGACATTT 5520

TCATCATTTC TAGTCTGAAC ACCTCAGTAT CTAAAATGAA CAAGAAGTCC TGGAAACGAA 5580 

GCAGTGTGGG GATAGGCCCG TGTGAAGGCT GCTGGGAGGC AGCAGACCTG GGTCTTCGGG 5640 

CTCAAGCAGT TCCCGCTACC AGCCCTGTCC ACCTCAGACG GGGGTCAGGG TGCAG3AGAG 5700 

AGCTGGATGG GTGTGGGGGC AGAGATG6GG ACCTGAACCC CAGGGCTGCC TTTTG3GGGT 5760 

GCCTGTGGTC AAGGCTCTCC CTGACCim CTCTCTG6CT TCATCTGACT TCTCCrCGCC 5820 

CATCCACCCG GTCCCCTGTG GCCTGAGGTG ACA.GTGAGTG CGCCGAGGCT AGTT&SCCAG 5880 

CTGGCTCCTA TGCCCATGCC ACCCCCCTCC AGCCCTCCTG 6GCCAGCTTC TGCCCCTGGC 5940 

CCTCAGHCA TCCTGATGAA AATGGTCCAT GCCAATGGCT CAGAAAGCAG CTGTCnrTCA 6000

GGGAGAACGG CGAGTGTGCT CA6AAGAAGA mnGCAGA AAAAACCAAG ATCCCTGCGG 6060 

TGHCAAGAT CGATGGTGAG TCCGGGTCCC TGGGGGACAC CCACCACCCC C6CCC0CGGG 6120 

GACTGTGGAC AGGHCAGGG GGCTGGCGTC GGGCCCTGGG ATGCTAAGGG ACFGGTGGTG 6180 

ATGAAGACAC TGCCHGACA CCTGCTICAC TTGCCTCCCC TGCCACCTGC CCGGGGCCH 6240 

GGGGCGGTGG CCAT6GGCAG GTCCCGGCTG GCGGGCTAAC CCACCAGGGT GACACCCGAG 6300 

CTCTCTTTGC TGGGGGGCGG GCGGTGCTCT GGGCCCTCAG GCT6A6CTCA GGAGG"ACCT 6360 

GTGCCCTCCC AG6GGTAA.CC GAGAGCCGH GCCCACTCCA GG6GCCCAGG TGCCC(X6A 6420 
CCCCAGCCCG CTCCACAGCT CCTTCATCTC CTGGAGACAA ACTCTGTCCG CCCTCGCTCA 6480

TTCACTT6TT CGTCCTAMT CCGAGATGAT AAAGCTTCGA GGGGGGGTTG GGGTTCCATC 6540

AGGGCTGCCC TTCCGCCGGG CAGCCTGGGC CACATCTGCC CTTG6CCCCC TCAGGACTCA 6600 

CTCTGACTG6 AGGCCCTGCA CTGACTGACG CCAGGGTGCC CA6CCCAGGG TCTCTG6CGC 6660 

CATCCAGCTG CACTGGGTTr GGGTGCTGGT CCTGCCCCCA AGCTGCCCGG ACACCACAGG 6720

CAGCCGGGGC TGCCCACT6G CCTCGGTCAG GGTGA6CCCC AGCTGCCCCC GCTCAGGGCT 6780

TGCCCCGACA AT6ACCCCAT CCTCAGGACG CACCCCCCn CCCTTGCTGG GCAGT6TCCA 6840 

6CCCCACCC6 AGATCGGGGG AAGCCCTATT TCTTGACAAC TCCAGTCCCT GGGGGAGG6G 6900 

GCCICAGACT GAGTGGTGAG TGTTCCCAAG TCCAGGAG6T GGTGGAGGGT CCTGGCGGAT 6960 

CCAGAGTTGA CAGT6.A6GGC TTCCTGGGCC CCATGCGCCT GGCAGTGGCA GCAGGGAAGA 7020 

GGAAGCACCA TTTCAGGGGT GGGGGAT6CC AGAGGCGCTC CCCACCCCGT CTTCGCCGGG 7080 

TGGTGACCCC GGGGGAGCCC CGCTGGTCGT GGAGGGTGCT GGGGGCTGAC TAGCAACCCC 7140 

TCCCCCCCCG TTGGAACTCA CTTTTCTCCC GTCHGACCG CGTCCAGCCT TGAATGAGAA 7200 

CAAAGTCCTT GTGCTGGACA CCGACTACAA AAAGTACCTG CTCTTCTGCA TGGAAAACAG 7260 

TGCTGAGCCC GAGCAAAGCC TGGCCTGCCA GTGCCrGGGT GGGTGCCAAC CCTGGCTGCC 7320 

CAGGGAGACC AGCTGCGTGG TCCTTGCTGC AACAGGGGGT GGGGGGTGGG AGCTTGATCC 7380

CCAGGAGGAG GAGGG6TGGG GGGTCCCTGA GTCCCGCCAG GAGAGAGTGG TCGCATACCG 7440 

GGAGCCAGTC TGCTGTGGGC CTGT6GGTGG CTGGGGACGG 6G6CCAGACA CACAGGCCGG 7500 

GAGACGGGTG GGCTGCAGAA CTGTGACT6G TGTGACCGTC GC6ATGGGGC CGGTGGTCAC 7560 

TGAATCTAAC AGCCTTTGn ACCGGGGAGT nCAAnATT TCCCAAAATA AGAACTCAGG 7620 

TACAAAGCCA TCUTCAACT ATCACATCCT GAAAACAAAT GGCAGGTCAC ATTTTCTGTG 7680 

CCGTAGCAGT CCCACTGGGC ATTTTCAGGG CCCCT6TGCC AGGGGGGCGC GGGCATCGGC 7740 



GAGTGGAGGC TCCTGGCTGT GTCAGCCGGC CCAGGGGGAG GAAGGGACCC GGAWGCCAG 7800

AGGTGGGGGG CAGGCT7TCC CCCTGTGACC TGCAGACCCA CTGCACTGCC CTGGGAGGAA 7860

GGGAGGGGM CTAGGCCAAG 6GGGAAGGGC AGGTGCTCTG GAGGGCAAGG GCAGACCTGC 7920

AGACCACCCT GGGGAGCAGG GACTGACCCC CGTCCCTGCC CCATAGTCAG GACCCCGGAG 7980

GTGGACAACG AGGCCCTGGA GAAATTCGAC AAAGCCCTCA AGGCCCTGCC CAT6CACATC 8040

CGGCTTGCCT TCAACCCGAC CCAGCTGGAG GGTGA6CACC CA6GCCCCGC CCTTCCCCAG 8100

GGCAGGAGCC ACCCGGCCCC GGGACGACCT CCTCCCATGG TGACCCCCA6 ctccc:aggc 8160

CTCCCAGGAG GAAGGGGTGG GGTGCAGCAC CCCGTGG6GG CCCCCTCCCC acccc:tgcc 8220

AGGCCTCTCT TCCCGAGGTG TCCAGTCCCA TCCTGACCCC CCCATGACTC TCCCTCCCCC 8280

ACAGQGCA6T GCCACGTCTA GGTGAGCCCC TGCCGGTGCC TCTGGGGIAA 6CTGCCTGCC 8340

CT6CCCCA,CG TCCT6GGCAC ACACAT6GGG TAGGGGGTCT TQGTGGQGCC TGGGACCCCA 8400

TGGGGTCCCC CCTGTGAGAA TGGCTGGAAG CTGGGGTCCC TCCTQJCGAC 8460

TGCAGAGCT6 GCTGGCCGCG TGCCACTCn GTGGGTGACC TGTGTCCTGG CCTCACACAC 8520

TGACCTCCTC CAGCTCCTTC CAGCAGAGCT AAGGCTAAGT GAGCCA6AAT GGTACCTAAG 8580

GCCAGGCTAG CGGTCCnCT CCCGAGGAGG GGCTGTCCTG GAACCACCAG CCATGCAGAG 8640

GCTGGCAAGG GTCTGGCA66 TGCCCCAGGA ATCACAGGGG GGCCCCATGT CCATTXAGG 8700

GCCCGGGAGC CnGGACTCC TCTGGGGACA GACGACGTCA CCACCGCCCC CCCCC(yvTCA 8760

GGGGGACTAG AAG66ACCAG GACTGCAGTC ACCCTTCCTG GGACCCA6GC CCCTCCAGGC 8820

CCCTCCTGGG GCTCCTGCTC TGGGCAGCTT CTCCnCACC AATAAAGGCA fAAACCTGTG 8880

CTCTCCCTTC TGAGTC1TTG CTGGACGACG GGWGGGGGT GGAGAAGTGG TGGGG/£GGA 8940

GTaGGCTCA GA6GATGACA GCGGGGCTGG 6ATCCAGGGC GTCTGCATCA CAGTC1TGTG 9000

ACAAC7GGGG GCCCACACAC ATCACTGCGG CTCTTTGAAA CmCAGGAA CCAGGCAGGG 9060

ACTCGGCAGA GACATCTGCC AGTTCACTTG GAGTGTTCAG ICAACACCCA AACTCCACAA 9120

AGGACAGAAA GTGGAAAATG GCTGTCTCTT AGTQMlAA ATATTGATAT GAAACTCA/^ 9180

TT6CTCATGG ATCAATATGC CTTTATGATC CAGCCAGCCA CTACTGTCGf ATCAACTCAT 9240 

GTACCCAAAC GCACTGATCT GTCTGGCTAA TGATGAGAGA TTCCCAGTAG AGAGCTGGCA 9300 

AGAGGTCACA GTGAGAACTG TCTGCACACA CAGCAGAGTC CACCAGTCAT CCTAAGGAGA 9360 

TCAGTCCT6G TGTTCATTGG AGGACT6ATG HGAAGCTGA AACTCCAATG CTTTGGCCAC 9420 

CTGATGTGAA GAGCTGACTC ATTTGAAAAG ACCCTGATGC TGGGAAAGAT T6AGGGCAGG 9480 

AGGAGAAG66 6ACGACAGAG GATGAGATGG nGGATGGCA TCACCAACAC AATGGACATG 9540 

GGTTTGGGTG GACTCCAGGA GnGGTGATG GACAGGGAGG CCTG6CGTGC TACGGAAGCG 9600 

6TTTATGGGG TCACAAAGAC TGAGTGACTG AACTGAGCTG AACTGAATGG AAATGAG6TA 9660 

TACAGCAAAG TGGGGATnT HAGATAATA AGAATATACA CATAACATAG TGTATACTCA 9720 

TATTTTTATG CATACCTGAA TGCTCAGTCA CTCAGTCGTA TCTGACTCTG TGACCTATGG 9780 

ACCGTAGCCT TCCAGGTTTC TTCTGTCCAC AGAATTCTCC AAGGCAAGAA TACTGGAGTG 9840 

GGTAGCCATT TCCTCCTCCA GGGGATCCTC CCGACCCAGG GAHGAACCG GCATCTCCTG 9900 

TAHGGCAGG IGGAnCHT ACCACTGTGC CACCA5GGAA GCCCGIGHA CTCTCTATGT 9960 

CCCACHAAT TACCAAAGCT GCTCCAAGAA AAAGCCCCTG T6CCCTCTGA GCnCCCGGC 10020 

CTGCAGAG6G TGGT6GGGGT AGACTGTGAC CIGGGAACAC CCTCCCGCTT CAGGACTCCC 10080

GGGCCACGTG ACCCACAGTC CTGCAGACAG CCGGGTAGCT CT6CTCTTCA AGGCTCAHA 10140 

TCTTTAAAAA AAACTGAGGT CTATnTGTG ACTTCGCTGC CGTAACTTCT GAACATCCAG 10200 

TGCGATGGAC AGGACCTCCT CCCCAGGCCT CAGGGGCTTC AGGGAGCCAG CCHCACCTA 10260 

TGAGTCACCA GACACTCGGG GGTGGCCCCG CCnCAGGGT GCTCACAGTC TTCCCATCGT 10320 

CCTGATCAAA GAGCAAGACC AATGACHCT TA6GAGCAAG CAGACACCCA CAGGACAQG 10380 

AGGTTCACCA GAGCTGAGCT GTCCTTTTGA ACCTAAAGAC ACACAGCTCT CGAAGGmT 10440 
CTCTTTAATC TGGATITAAG GCCTACTTGC CCCTCAAGAG GGAAGACAGT CCTQIATCTC 10500

CCCAGGACAG CCACTCGGT6 GCATCCGA6G CCACTTAGTA HATCTGACC GCACCCTGGA 10560 

AHAATCGGT CCAAACTGGA CAAAAACCTT GGTGGGAAGT TTCATCCCAG AGGCCTCAAC 10620

CATCCTGCTT TGACCACCCT GCATCTTTTT TTCITJTATG TGTATGCATG TATA"ATATA 10680 

TATATATnr TTTTTTTTTC ATmTTGGC T6TGCTGGCT GTrCGHGCA GnCCGTGCG 10740 

CAGGCnCTC TCTAGTTTCT CTCTAGTCTT CTCHATCAC AGAGCAGTCT CTA6/»CGATC 10800 

GACGCGT 10807 
(2) INFORMATION FOR SEQ ID NO: 6:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 47 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:

AATTCCGATC GACGCGTCGA CGATATACTC TAGACGATCG ACGCGTA 47 

(2) INFORMATION FOR SEQ ID NO: 7:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 47 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7:
AAGaACGCG TCGATCGTCT AGAGTATATC GTCGACGCGT CGATCGG 47 
(2) INFORMATION FOR SEQ ID NO: 8:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 24 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:



TGGATCCCCT GCCGGTGCCT CTGG 24 

(2) INFORMATION FOR SEQ ID NO: 9:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 24 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9:
AACGC6TCAT CCTCTGTGAG CCAG 24 
(2) INFORMATION FOR SEQ ID NO: 10:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 10 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(vii) IMMEDIATE SOURCE:
(B) CLONE: ZC6839

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10:
ACTACGTAGT 10 
(2) INFORMATION FOR SEQ ID NO: 11:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 25 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(vii) IMMEDIATE SOURCE:
(B) CLONE: ZC962

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11:
AGTCACCfCA GAAGAAAACG AGACA 25
(2) INFORMATION FOR SEQ ID NO: 12:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 45 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(vii) IMMEDIATE SOURCE:
(B) CLONE: ZC6303

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12:
ATTTGCGGCC GCCTGCAGCC ATGTGGCA6C TCACAAGCCT CCTGC 45 
(2) INFORMATION FOR SEQ ID NO: 13:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 45 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(vii) IMMEDIATE SOURCE:
(B) CLONE: ZC6337

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13:
CAGGMGGAG TTGGCGCGCT TGCGCCGTTG CAGCACCTGG TGGGC 45 
(2) INFORMATION FOR SEQ ID NO: 14:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 25 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(vii) IMMEDIATE SOURCE:
(B) CLONE: ZC6306

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14:
CnCTTCCTG AATTCTGTTT CHGC 
(2) INFORMATION FOR SEQ ID NO: 15:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 28 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(vii) IMMEDIATE SOURCE:
(B) CLONE: ZC6338

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15:
CGGATCCGCA AGCGCGCCAA CTCCHCC 
(2) INFORMATION FOR SEQ ID NO: 16:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 28 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(vii) IMMEDIATE SOURCE:
(B) CLONE: ZC6373

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16:
AAAGTAAAAA MGATCTAAA AAFrTAAC 28 
(2) INFORMATION FOR SEQ ID NO: 17:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 32 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(vii) IMMEDIATE SOURCE:
(B) CLONE: ZC6305

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17:
GTGTCTCGn TTCTTCTTAA GTGACTGCGC H 32 
(2) INFORMATION FOR SEQ ID NO: 18:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 49 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(vii) IMMEDIATE SOURCE:
(B) CLONE: ZC5302

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18:
TTAAGW^ AACGAGACAC AGAAGACCAA GAAGACCAAG TAGATCCGC 49 
(2) INFORMATION FOR SEQ ID NO: 19:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 43 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(vii) IMMEDIATE SOURCE:
(B) CLONE: ZC6304

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19:
GGATCTACn GGTCnCTTG GTCnCTGTG TCTCGTTnc HC 43 
(2) INFORMATION FOR SEQ ID NO: 20:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 4 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20:
Arg Arg Lys Arg



(2) INFORMATION FOR SEQ ID NO: 21:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 4 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21:
Lys Arg Lys Arg 
(2) INFORMATION FOR SEQ ID NO: 22:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 8 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 22:

Ser His Leu Arg Arg Lys Arg Asp 
1 5 

(2) INFORMATION FOR SEQ ID NO: 23:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 6763 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23:

ACGCGTC6AC CTGCAGGTCA ACGGATCTCT CTGTCTGTTT TCATGTTAGT ACCACACTQT 60 

TTTGGTGCCT GTAGCTTTCA GCTACAGTCT GAAGTCATAA A6CCTGGTAC CTCCAGCTCT 120 

GHCTCTCTC AAGATTGTGT TCTGCTGHT GGGTCTTTAG TGTCTCCACA CAATTTTAG 180

AATT6TTTGT TCTAGTTCTG TGAAAAATGA TGCTGGTATT TT6ATAAGGA TTGCA'TGAA 240 

TCTGTAAAGC TACAGATATA GTCAHGGGT AGTACAGTCA CTTTMCAAT ATTAACTCTT 300 

CACATCTGTG AGCATGATAT ATTTrCCCCC TCTATATCAT CTTCAATTCC TCCTA""CAGT 360 

nCTTTCAn GCAGTTTTCT GAGTACA6GT CHACACCTC CnGGHAGA GTCAT'CCTC 420 

AGTATTTTAT TCCTTTGATA CAAHGTGAA TGAGGTAAH TTCTTAGTn CTCTT'CTGA 480 

TAGCTCAHG TTAGTGTATA TATAGAAAAG CAACAGATH CTATGTATTA ATTFTGTATC 540 
CT6CAACAGA TTTCTATGTA nAATTTrGT ATCCTGCTAC TTTACGGAAT TCACTTAHA 600

GCrmiGGT GACATCTTGA GGATTTTCTG AAGAAAATGG CATGGTATGG TAGGACAAGG 660 

TGTCATGTCA TCTGCAAACA GTGGCAGTTT TCCTTCnCC cnCCAACCT GGAmCTTT 720 

GATTTCmC TGTCTGAGTA CGACTAGGAT TCCCAATACT ATACCGAATA AAAGTGGCAA 780 

GAGTGGACAT CCnGTCHA TTTnCTGAC CTTAGAGGAA ATGCTTTCAG TTTTTCACCA 840 

HAAHATAA TGTTTACTGT GGGCHGTCA TATGT6GCCT TCATTATATG GAGGTCTATT 900 

CCaCTATAC CCACCHGn GAGAGTnTT ATCATAAAAG TATGHGAAT TTTGTCAAAA 960 

GTTrrrCCTG CATCTATIGA GATGATmr ACTCTTCAAT TCATTAATGA TmTATTCT 1020 

TCATTTTGTT AATGATTTCC AHCnCAAT TTGHAACGT GGTATATCAC AHGAHGAT 1080 

TT6TGGATAC CTTTGTATCC CTGGGATAAA CCTCACHGA TCATGAGCTT TCAAIGTAH 1140 

TTTGAATTCA CTTTGCTAAT ATTCTGTTGG GTATmTGC ATCTCTATTC ATCAATGATA 1200 

TTGGCCTAAG AAAGGITTTG TCTGGTnTA GTATCAGGGT GATGCTGGCC TCATAGAGAG 1260 

AGITTAGAAG CATTTCCTCC TCTTTGATTT TTCGGAATAG TTTGAGTAGG ATA6GTATTA 1320 

ACTCTTCmr AA.^TGTTTGG GGACHCCCT GGTGAGCCGG TGGHGAGAA TCCGCCTCAG 1380 

GGATGTGGGT TTGATCCCTG GTCAGGGAAC CAHAATAAG ATCCCACATG CT6CAGGCAA 1440 

CAAGCCCCCA AGCTGCAACC ACT6AGCTGC AACCGCTGCA GTGCCCACAG GCCACGACCA 1500 

6AGAAAGCCC ACATACAGCA GGGAAGACCC AGCACAACCG 6AAAAAG6AG "ITTGGTGGAA 1560 

TACAGCTGTG AAGCCGTCTG GTCCTGGACT CCTGCHGAG GGAATTTm AAAAAT7ATI 1620 

GATTCAATTT CATTACTGGT AACTGGTCTG nCATATTrT CTATTTCTTC CGGGTTCAGT 1680 

CTTGGGAGAT TGTACATGCC TAGGAAT6TG TCCGTncn CTAGGHGTC CATnTAnG 1740 

GACATGCATG GGAGCACACA GCACCGACCA GCGAGACTCA TGCTGGCTTC CTG6GGCCAG 1800 

GCTGGGGCCC CAAGCAGCAT GGCATCCTAG AGTGTGTGAA AGCCCACTGA CCCTGCCCAG 1860 

CCCCACAATT TCAHCTGAG AAGTGA7TCC TTGCHCTGC ACTTACAGGC CCAGGATCTG 1920 
ACCTGCnCT GAGGAGCAGG GGTTTTGGCA GGACGGGGAG ATGCTGAGAG CCGACGGGGG 1980

TCCAGGTCCC CTCCCAGGCC CCCCTGTCTG GGGCAGCCCT TGGGAAAGAT TGCCCCAGTC 2040

TCCCTCCTAC AGTGGTCAGT CCCAGCTGCC CCAGGCCAGA 6CTGCT7TAT TTCaJTCTCT 2100

CTCTCTGGAT G6TATTCTCT GGAAGCTGAA GGTTCCTGAA GTTATGAATA Gcrr'GcccT 2160

GAAGGGCAT6 GTTrGTGGTC ACGGTTCACA GGAACHGGG AGACCCTGCA GCTC/\GACGT 2220

CCCGA6ATTG GTGGCACCCA GA7TTCCTAA GCTCGCTGGG GAACAGGGCG CTTG'TTCTC 2280

CCTG6CTGAC CTCCCTCCTC CCT6CATCAC CCAGTTCT6A AAGCAGAGCG GTGCVGGG6T 2340

CACAGCCTCT CGCATCTAAC GCCGGTGTCC AAACCACCCG TGCTGGTGTT CGGGGGGCTA 2400

CCTATGGGGA AGGGCTTCTC ACTGCAGTGG TGCCCCCCGT CCCCTCT6AG ATCACiAAGTC 2460

CCAGTCCGGA CGTCAAACAG GCCGAGCTCC CTCCAGAGGC TCCAGG6AGG GATCCTTGCC 2520

CCCCCGCTGC TGCCTCCAGC TCCTGGTGCC GCACCCTTGA GCCTGATCTT GTAGACGCCT 2580

CAGTCTAGTC TCTGCCTCCG TGTTCACACG CCTTCTCCCC ATGTCCCCTC CGTG7CCCCG 2640

nrrcTCTCA CAAGGACACC G6ACATTAGA TTAGCCCCTG rrccAGCCTC ACCTCAACAG 2700

CTCACATCTG TAA;\G.'ICCTA GATTCCAAAC AAGATTCCAA CCTGAAGTTC CCGGTGGATG 2760

TGA6TTCTGG G6C6ACATCC TTCAACCCCA TCACAGCTT6 CAGTTCATCG CAAWCArCG 2820

AACCT6GGGT TTATCGTAA^ ACCCAGGTTC TTCATGAAAC ACTGAGC1TC GAGGCHGn 2880

GCAAGAATTA AAGGTGCTAA TACAGATCAG GGCAAGGACT GAAGCTGGCT AAGCCTCCTC 2940

mcCATCAC AGGAAAGGGG GCCCT6GGGG CGGCTGGAGG TCTGCTCCCG TGAGTGAGCT 3000

CTTTCCT6CT ACAGTCACCA ACAGTCTCTC TGGGAAGGAA ACCAGAGGCC AGA6AGCAAG 3060

CCGGAGCTAG TTTAGGAGAC CCCTGAACCT CCACCCA/«5A TGCT6ACCAG CCAGC36GCC 3120

CCCTG6AAAG ACCCTACAGT TCAGGGGGGA AGAG6GGCTG ACCC6CCAGG TCCCTGCTAT 3180

CAGGAGACAT CCCCGCTATC AGGAGAT7CC CCCACCHGC TCCCGHCCC CTATCCCAAT 3240

ACGCCCACCC CACCCCTGTG ATGAGCAGIT TAGTCACHA GAATGTCAAC TGAAGGCTTT 3300

TGCATCCCCT TTGCCAGAGG CACAAGGCAC CCACAGCCTG CT6GGTACCG ACGCCCATGT 3360 

GGAT7CAGCC AGGAGGCCTG TCCTGCACCC TCCCTGCTCG GGCCCCCTCT GTGCTCAGCA 3420 

ACACACCCAG CACCAGCATT CCCGCT6C7C CTGAGGTCTG CA6GCA6CTC GCTGTA6CCT 3480 

GAGCGGTGT6 GAGGGAAGTG TCCTGGGAGA ITTAAAATGT GAGAGGCGGG AGGTGG6AGG 3540 

nGGGCCCTG TGGGCCTGCC CATCCCACGT GCCTGCATTA GCCCCAGTGC TGCTCAGCCG 3600 

TGCCCCCGCC GCAGGGGTCA GGTCACTTTC CCGTCCTGGG GnAHATGA CTCTTGTCAT 3660 

TGCCAHGCC ATnTTGCTA CCCTAACTGG 6CA6CAGGTG CTTGCAGAGC CCTCGATACC 3720 

GACCAGGTCC TCCCTCGGAG CTCGACCTGA ACCCCATGTC ACCCHGCCC CAGCCTGCAG 3780 

AGGGTGGGTG ACTGCAGAGA TCCCHCACC CAAGGCCACG GTCACATGGT TTGGAGGAGC 3840 

TG6T6CCCAA GGCAGAGGCC ACCCTCCAGG ACACACCTGT CCCCAGTGCT GGCTCTGACC 3900 

TGTCCTTGTC TAAGAGGCTG ACCCCGGAAG TGHCCIGGC ACTGGCAGCC AGCCTGGACC 3960 

CAGAGTCCAG ACACCCACCT GTGCCCCCGC TTCTGGGGTC TACCAGGAAC CGTCTAGGCC 4020 

CAGAGGGGAC nCCTGCTTG GCCTTGGATG GAAGAAGGCC TCCTAHGTC CTCGTA6AGG 4080 

AAGCCACCCC GGGGCCTGA-G GATGAGCCAA GTGGGATTCC 6G6AACCGCG TGGCTGGGGG 4140 

CCCAGCCCGG GCTGGCTGGC CTGCATGCCT CCTGTATAA6 GCCCCAAGCC TGCTGTCTCA 4200 

GCCCTCCACT CCCTGCAGAG CTCAGAAGCA CGACCCCAGG GATATCATCG ATAAGCTTG6 4260 

ATCCCCTGCC GGTGCCTCTG GGGTAAGCTG CCTGCCCT6C CCCACGTCCT GGGCACACAC 4320 

ATGGGGTAGG GGGTCnGGT GGGGCCTGGG ACCCCACATC AGGCCCTGGG GTCCCCCCTG 4380 

TGAGAATGGC TGGAAGCTGG GGTCCCTCCT GGCGACTGCA GAGCTGGCTG GCCGCGTGCC 4440 

ACTCnGTGG GTGACCTGTG TCCTGGCCTC ACACACTGAC CTCCTCCAGC TCCHCCAGC 4500 

AGAGCTAAGG CTAAGTGAGC CAGAATGGTA CCTAAGGGGA GGCTAGCGGT CCHCTCCCG 4560 

AGGAGGGGCT GTCCTGGAAC CACCAGCCAT GGAGAGGCTG GCAAGGGTCT GGCAGGTGCC 4620 
CCA(3GAATCA CAGGGGGGCC CCATGTCCAT TiCAGSGCCC GGGAGCCTTG GACTCCTCTG 4680

G6GACAGACG ACGTCACCAC CGCCCCCCCC CCATCAGGGG GACTA6AAGG GACG^GGACT 4740 

GCAGTCACCC nCCTGGGAC CCAGGCCCCT CCAGGCCCCT CCTGGGGCTC CTGCTCTGGG 4800 

CAGCnaCC HCACCAATA AAGGCATAAA CCTGTGCTCT CCCTTCTGAG TCrrrGCTGG 4860 

ACGACGGGCA GGG3GTGGAG AAGTGGTGGG GAGGGAGTCT CGCTCAGAGG ATGACAGCGG 4920 

GGCT6GGATC CAGG6CGTCT GCATCACAGT CTTGTGACAA CTGG66GCCC ACACaCATCA 4980

CTGCGGCTCT TTG.AAACTTT CAGGAACCAG GGAGGGACTC GGCAGAGACA TCTGCCAGTT 5040 

CACnGGAGT GTTCAGTCAA CACCCAAACT CGACAAAGGA CAGAAAGTGG AAAA"GGCTG 5100 

TCTCTTAGTC TAATAAATAT TGATATGAAA CTCAAGHGC TCATGGATCA ATATGCCTTT 5160 

ATGATCCAGC CAGCCACTAC TGTCGTATCA ACTCATGTAC CCAAACGCAC TGATCTGTCT 5220 

GGCTAATGAT GAGAGAHCC CAGTAGAGAG CTGGCAAGAG GTCACAGTGA GAAC"GTCTG 5280 

CACACACAGC AGaGTCCACC AGTCATCCTA AGGAGATCAG TCCTGGTGn CATTCGAGGA 5340 

CTGATGnGA AGCTGAAACT CCAATGCTTT GGCCACCTGA TGTGAAGAGC TGAC">'CA7TT 5400 

GAAAAGACCC TGATGCTGGG AAAGAHGAG GGCAGGAGGA GAAGGGGACG ACAG/fiGATG 5460 

AGATGGTTGG ATGGCATCAC CAACACAATG GACATGGGTT TCGGTGGACT CCAGCAGTTG 5520 

GTGATGGACA GGGAGGCCTG GCGTGCTACG GAAGCGGITT ATGGGGTCAC AAAG/iCTGAG 5580

TGACTGAACT GAGCTGAACT GAATGGAAAT GAGGTATACA GCAAAGTGGG GATfiTTTAG 5640 

ATAATAAGAA TATACACATA ACATAGTGTA TACTCATAn TTTATGCATA CCT6/^AT6CT 5700 

CA6TCACTCA GTCGTATCTG ACTCTGTGAC CTATGGACCG TAfiCCTTCCA GGtHCnCT 5760 

GTCCACAGAA nCTCCAAGG CAAGAATACT GGAGTGGGTA GCCAITTCCT CCTCCAGGGG 5820 

ATCCTCCC6A CCCAGGGAH GAACCGGCAT CTCCTGTAn GGCAGGTGGA TTCPTACCA 5880 

CTGTGCCACC AGGGAAGCCC GTGTTACTCT CTATGTCCCA CTTAAnACC AAAGCTGCTC 5940 
CAAGAAAAAG CCCCTGTGCC CTCT6AGCTT CCC6GCCT6C AGAGGGTGGT 6GGGGTAGAC 6000

TGTGACCTGG GAACACCCTC CCGCTTCAGG ACTCCCGGGC CACGT6ACCC ACAGTCCTGC 6060

AGACAGCCGG GTAGCTCTGC TCnCAAGGC TCATTATCTT TAAAAAAAAC TGAGGTCTAT 6120

TTTGTGACn CGCT6CCGTA ACrrCTGAAC ATCCA6TGCG ATGGACAGGA CCTCCTCCCC 6180

AGGCCTCAGG GGCTTCAGGG AGCCAGCCTT CACCTATGAG TCACCAGACA CTCGGGGGTG 6240

GCCCCGCCTT CAGGGTGCTC ACAGTCncC CATCGTCCTG ATCAAAGAGC AAGACCAATG 6300

ACTTCTTAGG AGCAAGCAGA CACCCACAGG ACACTGAGGT TCACCAGAGC TGAGCTGTCC 6360

TTiTGAACCT AAAGACACAC AGCTCTCGAA GGTTnCTCT HAATCTCGA TTrAAGGCCT 6420

ACTT6CCCCT CAAGAGGGAA GACAGTCCTG CATGTCCCCA GGACAGCCAC TCGGTGGCAT 6480

CCGAGGCCAC HAGTAnAT CTGACCGCAC CCTGGAATTA ATCQGTCCAA ACTGGACAAA 6540

AACCnCGTG GGAAGTTTCA TCCCAGAGGC CTCAACCATC CTGCTTTGAC CACCCTGCAT 6600

ClllllilCT HTATGTGTA TGCATGTATA TATATATATA TATTTTrnT 1 1 1 1 ICATTT 6660

TTTGGCTGTG CTGGCTGTTC GTTGCA6TTC GGTGCGCAGG CHCTCTCTA GTHCTCTCT 6720

AGTCTTCTCT TATCACAGAG CAGTCTCTAG ACGATCGACG CGT 6763



(2) INFORMATION FOR SEQ ID NO: 24:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 5 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 24:

Arg Ile Arg Lys Arg
1 5

(2) INFORMATION FOR SEQ ID NO: 25:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 5 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 25:

Gln Arg Arg Lys Arg
1 5



CLAIMS 

1. A method for producing protein C in a 
transgenic animal comprising: 

providing a DNA construct comprising a first DNA 
segment encoding a secretion signal and a protein C propeptide 
operably linked to a second DNA segment encoding protein C, 
wherein the encoded protein C comprises a two-chain cleavage 
site modified from Lysine (Lys)-Arginine (Arg) to R1-R2-R3-R4,
and wherein each of R1, R2, R3, R4 is individually Lys or Arg,
and wherein said first and second segments are operably linked 
to additional DNA segments required for expression of the 
protein C DNA in a mammary gland of a host female animal; 

introducing said DNA construct into a fertilized egg 
of a non-human mammalian species; 

inserting said egg into an oviduct or uterus of a 
female of said species to obtain offspring carrying said DNA 
construct;

breeding said offspring to produce female progeny 
that express said first and second DNA segments and produce 
milk containing protein C encoded by said second segment, 
wherein said protein has anticoagulant activity upon 
activation; 

collecting milk from said female progeny; and

recovering the protein C from the milk.

2. The method of claim 1, further comprising the 
step of activating the protein C. 

3. The method of claim 1, wherein R1-R2-R3-R4 is 
Arg-Arg-Lys-Arg (SEQ ID NO: 20). 

4. The method of claim 1, wherein said species is 
selected from sheep, rabbits, cattle and goats. 
5. The method of claim 1, wherein each of said
first and second DNA segments comprises an intron.

6. The method of claim 1, wherein the second DNA 
segment comprises a DNA sequence of nucleotides as shown in 
SEQ ID NO: 1 or SEQ ID NO: 3.

7. The method of claim 6, wherein the second DNA 
segment comprises the DNA sequence of nucleotides as shown in
SEQ ID NO: 1.

8. The method of claim 1, wherein the additional
DNA segments comprise a transcriptional promoter selected from
the group consisting of casein, β-lactoglobulin, α-lactalbumin
and whey acidic protein gene promoters. 

9. The method of claim 8, wherein the 
transcriptional promoter is the β-lactoglobulin gene promoter.

10. A transgenic non-human female mammal that 
produces recoverable amounts of human protein C in its milk, 
wherein at least 90% of the human protein C in the milk is
two-chain protein C. 

11. A process for producing a transgenic offspring
of a mammal comprising: 

providing a DNA construct comprising a first DNA 
segment encoding a secretion signal and a protein C propeptide 
operably linked to a second DNA segment encoding protein C,
wherein the encoded protein C comprises a two-chain cleavage
site modified from Lys-Arg to R1-R2-R3-R4, and wherein each of
R1, R2, R3, R4 is individually Lys or Arg, and wherein said
first and second segments are operably linked to additional 
DNA segments required for expression of the protein C DNA in 
the mammary gland of a host female animal; 
introducing said DNA construct into a fertilized egg
of a non-human mammalian species; and

inserting said egg into an oviduct or uterus of a 
female of said species to obtain offspring carrying said DNA 
construct.

12. The process according to claim 11, wherein
R1-R2-R3-R4 is Arg-Arg-Lys-Arg (SEQ ID NO: 20).

13. The process according to claim 11, wherein the 
offspring is female. 

14. The process according to claim 11, wherein the 
offspring is male. 

15. A non-human mammal produced according to the 
process of claim 10. 

16. A non-human mammal of claim 15, wherein the
mammal is female. 

17. A female mammal according to claim 16 that
produces milk containing protein C encoded by said DNA 
construct, wherein said protein C has anticoagulant activity 
upon activation. 

18. A non-human mammalian embryo containing in its 
nucleus a heterologous DNA segment encoding protein C, wherein 
the encoded protein C comprises a two-chain cleavage site 
modified from Lys-Arg to R1-R2-R3-R4, and wherein each of R1,
R2, R3, R4 is individually Lys or Arg.
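
By way of illustration only, and not as part of the claims, the recitation that each of R1, R2, R3 and R4 is individually Lys or Arg can be enumerated with a short Python sketch. The function names are hypothetical and do not appear in the specification; the sketch only lists the tetrapeptide combinations that the claim language permits and checks whether a given site falls among them.

```python
from itertools import product

# Residues permitted at each of the four positions, per the claim language
# ("each of R1, R2, R3, R4 is individually Lys or Arg").
ALLOWED = ("Lys", "Arg")


def cleavage_site_variants():
    """Return every R1-R2-R3-R4 combination allowed by the recitation."""
    return ["-".join(combo) for combo in product(ALLOWED, repeat=4)]


def is_claimed_site(site):
    """True if a hyphen-separated site has four residues, each Lys or Arg."""
    residues = site.split("-")
    return len(residues) == 4 and all(r in ALLOWED for r in residues)


variants = cleavage_site_variants()
for v in variants:
    print(v)
```

Under this reading there are 2 x 2 x 2 x 2 = 16 candidate sites, among them Arg-Arg-Lys-Arg (SEQ ID NO: 20) and Lys-Arg-Lys-Arg (SEQ ID NO: 21); the unmodified Lys-Arg dipeptide is not among them.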